Python Programming Glossary: findall

How to find overlapping matches with a regexp?

http://stackoverflow.com/questions/11430863/how-to-find-overlapping-matches-with-a-regexp

to find overlapping matches with a regexp? match = re.findall(r'\w\w', 'hello'); print match # ['he', 'll'] Since \w\w means two characters.. But why do 'el' and 'lo' not match the regex? match1 = re.findall(r'el', 'hello'); print match1 # ['el'] .. findall doesn't yield overlapping matches by default. This expression..
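
The answer's own expression is cut off in the snippet above. A common workaround, shown here as a minimal sketch and not necessarily the one the answer used, is to wrap the pattern in a zero-width lookahead containing a capturing group, so that each match consumes no characters:

```python
import re

text = "hello"

# Plain findall consumes each match, so the pairs never overlap.
non_overlapping = re.findall(r"\w\w", text)

# A lookahead succeeds at every position without consuming characters;
# the capturing group inside it returns the two-character window.
overlapping = re.findall(r"(?=(\w\w))", text)

print(non_overlapping)
print(overlapping)
```

Because the lookahead itself matches an empty string, findall returns the group contents, not the (empty) overall match.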

How do you translate this regular-expression idiom from Perl into Python?

http://stackoverflow.com/questions/122277/how-do-you-translate-this-regular-expression-idiom-from-perl-into-python

_set_match(re_.search(pattern, string, flags)) @wraps(re_.findall) def findall(pattern, string, flags=0): matches = re_.findall(pattern, string, flags); if matches: _set_match(matches[-1]); return..

Parsing XML with namespace in Python ElementTree

http://stackoverflow.com/questions/14853243/parsing-xml-with-namespace-in-python-elementtree

code: tree = ET.parse("filename"); root = tree.getroot(); root.findall('owl:Class') Because of the namespace, I am getting the following.. not too smart about namespaces. You need to give the .find(), findall() and iterfind() methods an explicit namespace dictionary. This.. 'http://www.w3.org/2002/07/owl#'} # add more as needed root.findall('owl:Class', namespaces=namespaces) Prefixes are only looked up..
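
As a small illustration of the namespace dictionary described above, the sketch below parses an inline document instead of a file. The sample XML and the rdf prefix are assumptions made for the example; only the owl namespace URI comes from the snippet.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used in place of ET.parse("filename").
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="#Animal"/>
  <owl:Class rdf:about="#Plant"/>
</rdf:RDF>"""

root = ET.fromstring(SAMPLE)

# Map the prefix used in the search path to the namespace URI.
namespaces = {"owl": "http://www.w3.org/2002/07/owl#"}

# The prefix is resolved through the dictionary, not through the document.
classes = root.findall("owl:Class", namespaces)
tags = [el.tag for el in classes]
print(tags)
```

The prefix in the path only has to match a key in the dictionary; it does not have to be the prefix the XML file itself uses.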

python regular expression: re.findall(r“(do|re|mi)+”,“mimi rere midore”)

http://stackoverflow.com/questions/15547033/python-regular-expression-re-findallrdoremi-mimi-rere-midore

regular expression: re.findall(r"(do|re|mi)+", "mimi rere midore") I couldn't.. I couldn't understand why this regular expression: re.findall(r"(do|re|mi)+", "mimi rere midore") generates this result: ['mi', 're'.. 'midore'] ... However, when I use this regular expression: re.findall(r"(?:do|re|mi)+", "mimi rere midore") it generates the result as expected...
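
The difference between the two calls comes down to how findall treats capturing groups. The following sketch, written for Python 3, compares a capturing group with a non-capturing one on the same input:

```python
import re

text = "mimi rere midore"

# With one capturing group, findall returns the group, and a repeated
# group only remembers its last repetition.
last_repetition_only = re.findall(r"(do|re|mi)+", text)

# With a non-capturing group (?:...), findall returns the whole match.
whole_matches = re.findall(r"(?:do|re|mi)+", text)

print(last_repetition_only)
print(whole_matches)
```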

Beautiful Soup findAll doen't find them all

http://stackoverflow.com/questions/16322862/beautiful-soup-findall-doent-find-them-all

the half of them... Different HTML parsers deal..

How to split a string by commas positioned outside of parenthesis?

http://stackoverflow.com/questions/1648537/how-to-split-a-string-by-commas-positioned-outside-of-parenthesis

One way to do it is to use findall with a regex that greedily matches things that can go between.. Elvis Presley, Jane Doe (Jane Doe)" r = re.compile(r'(?:[^,(]|\([^)]*\))+'); r.findall(s) # ['Wilbur Smith (Billy, son of John)', ' Eddie Murphy (John)', '..
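
A minimal sketch of the findall approach follows. The pattern is an assumption about what "greedily matches things that can go between" separators means: any run of characters that are neither a comma nor an opening parenthesis, or a complete parenthesised group. The input string is hypothetical and nested parentheses are not handled.

```python
import re

# Hypothetical input with commas both inside and outside parentheses.
s = "a (b, c), d, e (f)"

# Either a character that is not ',' or '(' ... or a whole '(...)' group.
field = re.compile(r"(?:[^,(]|\([^)]*\))+")

# Each match is one field; strip the space left after each separator.
parts = [part.strip() for part in field.findall(s)]
print(parts)
```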

How do I ensure that re.findall() stops at the right place?

http://stackoverflow.com/questions/17765805/how-do-i-ensure-that-re-findall-stops-at-the-right-place

do I ensure that re.findall() stops at the right place? Here is the code I have: a = '<title>.. aaa</title><title>aaa2</title><title>aaa3</title>'; import re; re.findall(r'<(title)>(.*)<(/title)>', a) The result is: [('title', 'aaa</title><title>.. a title for the web site. My question is how do I limit findall to a single <title></title>?..
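
The usual cause of a match that runs too far is a greedy quantifier. The sketch below uses a hypothetical input string and a simplified pattern to compare the greedy .* with the lazy .*?:

```python
import re

# Hypothetical input with three title elements on one line.
a = "<title>aaa</title><title>aaa2</title><title>aaa3</title>"

# Greedy: .* runs to the last closing tag, so everything is one match.
greedy = re.findall(r"<title>(.*)</title>", a)

# Lazy: .*? stops at the first closing tag it can reach.
lazy = re.findall(r"<title>(.*?)</title>", a)

print(greedy)
print(lazy)
```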

Python re.search

http://stackoverflow.com/questions/20240239/python-re-search

Hello and the word World is not matched. When I used re.findall() I could get both Hello and World. My question is why we can't.. the string. In order to match every occurrence you need re.findall(); documentation: Return all non-overlapping matches of pattern.. regex.search("123hello456world789").groups() # ('hello',) # using findall we get every item. regex.findall("123hello456world789") # ['hello'..
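
To make the contrast concrete, here is a short sketch using the test string from the snippet. The pattern is an assumption, since the original one is not shown.

```python
import re

# Hypothetical pattern: one run of lowercase letters.
regex = re.compile(r"[a-z]+")
text = "123hello456world789"

# search stops at the first match anywhere in the string.
first = regex.search(text).group()

# findall keeps scanning and returns every non-overlapping match.
every = regex.findall(text)

print(first)
print(every)
```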

Crawler doesn't run because of error in htmlfile = urllib.request.urlopen(urls[i])

http://stackoverflow.com/questions/20308043/crawler-doesnt-run-because-of-error-in-htmlfile-urllib-request-urlopenurlsi

urls[i]) htmltext = htmlfile.read() print(htmltext) titles = re.findall(pattern, htmltext) print(titles) i += 1 But I'm having this error.. python crawler scrapper 2 0.py", line 17, in <module> titles = re.findall(pattern, htmltext) File "C:\Python33\lib\re.py", line 201, in findall return _compile(pattern, flags).findall(string) TypeError: can't..
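
The error message is truncated above. Assuming it is the usual mismatch between a str pattern and the bytes returned by read() in Python 3, the sketch below reproduces it without any network access and shows two ways around it. The byte string and the pattern are stand-ins invented for the example.

```python
import re

# Stand-in for htmlfile.read(), which returns bytes in Python 3.
html_bytes = b"<html><title>Example</title></html>"
pattern = r"<title>(.+?)</title>"

# A str pattern applied to bytes raises TypeError.
try:
    re.findall(pattern, html_bytes)
    raised = False
except TypeError:
    raised = True

# Option 1: decode the bytes first (UTF-8 is an assumption about the page).
titles = re.findall(pattern, html_bytes.decode("utf-8"))

# Option 2: keep the data as bytes and use a bytes pattern.
titles_bytes = re.findall(pattern.encode("ascii"), html_bytes)

print(raised, titles, titles_bytes)
```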

How do I match contents of an element in XPath (lxml)?

http://stackoverflow.com/questions/2637760/how-do-i-match-contents-of-an-element-in-xpath-lxml

'Example')[0].tag In case you would like to use iterfind(), findall(), find(), findtext(), keep in mind that advanced features like value.. . lxml.etree supports the simple path syntax of the find, findall and findtext methods on ElementTree and Element, as known from..

Why doesn't finite repetition in lookbehind work in some flavors?p

http://stackoverflow.com/questions/3159524/why-doesnt-finite-repetition-in-lookbehind-work-in-some-flavorsp

to use a capturing group instead: ^\d{1,2} (\d{1,2}) Note that findall returns what group 1 captures if you only have one group. Capturing.. on ideone.com): p = re.compile(r'(?:(?<=^\d )|(?<=^\d\d ))\d{1,2}'); print p.findall("12 34 56") # "34"; print p.findall("1 23 45") # "23"; p = re.compile(r'^\d{1,2} (\d{1,2})'); print p.findall..
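
The note that findall returns what group 1 captures can be checked with a short Python 3 sketch. The pattern is an assumption consistent with the description: one or two leading digits, a space, then a captured run of one or two digits.

```python
import re

# Hypothetical pattern: skip the first number, capture the second.
p = re.compile(r"^\d{1,2} (\d{1,2})")

# With exactly one capturing group, findall returns the group contents
# and not the full matched text.
second_a = p.findall("12 34 56")
second_b = p.findall("1 23 45")

print(second_a)
print(second_b)
```

This sidesteps the lookbehind entirely, which matters because Python's re module only accepts fixed-width lookbehind patterns.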

Find the indexes of all regex matches in Python?

http://stackoverflow.com/questions/3519565/find-the-indexes-of-all-regex-matches-in-python

triple-quoted and such strings at the moment). When I use findall, I get a list of the matching strings, which is somewhat nice..
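
When the positions of the matches are needed in addition to the matched strings, re.finditer yields match objects that carry their indexes. The sketch below uses a hypothetical pattern for simple double-quoted strings:

```python
import re

# Hypothetical input and pattern: simple double-quoted strings.
text = 'say "hi" and "bye"'
pattern = re.compile(r'"[^"]*"')

# Each match object exposes start() and end() offsets into the text.
spans = [(m.start(), m.end()) for m in pattern.finditer(text)]

# The matched strings can be recovered by slicing with those offsets.
strings = [text[start:end] for start, end in spans]

print(spans)
print(strings)
```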

Is there a Perl equivalent of Python's re.findall/re.finditer (iterative regex results)?

http://stackoverflow.com/questions/467800/is-there-a-perl-equivalent-of-pythons-re-findall-re-finditer-iterative-regex-r

there a Perl equivalent of Python's re.findall/re.finditer (iterative regex results)? In Python, compiled regex patterns have a findall method that does the following: Return all non-overlapping matches..

Parse XML file into Python object

http://stackoverflow.com/questions/5530857/parse-xml-file-into-python-object

</file> </encspot>""" tree = et.fromstring(sxml) for el in tree.findall('file'): print ' ' for ch in el.getchildren(): print ' 15 30 '.format.. to create a data structure: [(ch.tag, ch.text) for e in tree.findall('file') for ch in e.getchildren()] Which creates a list of tuples.. is on: [[(item.tag, item.text) for item in ch] for ch in tree.findall('file')] # [[('Bitrate', '131'), ('Name', 'some filename.mp3'), ('Encoder',..
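
The list-of-tuples idea can be sketched in Python 3 as follows. The sample XML values are invented for illustration, and direct iteration over an element replaces getchildren(), which was removed from ElementTree in Python 3.9.

```python
import xml.etree.ElementTree as et

# Hypothetical sample data in the shape the snippet suggests.
sxml = """<encspot>
  <file>
    <Name>some filename.mp3</Name>
    <Encoder>Gogo</Encoder>
    <Bitrate>131</Bitrate>
  </file>
  <file>
    <Name>another filename.mp3</Name>
    <Encoder>iTunes</Encoder>
    <Bitrate>128</Bitrate>
  </file>
</encspot>"""

tree = et.fromstring(sxml)

# One inner list per <file>; iterating an element yields its children.
records = [[(item.tag, item.text) for item in f] for f in tree.findall("file")]
print(records)
```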

Regular expression group capture with multiple matches

http://stackoverflow.com/questions/5598340/regular-expression-group-capture-with-multiple-matches

work this way. A possible solution is to try to use findall or similar: r = re.compile(r'\w'); r.findall(x) # ['a', 'b', 'c', 'd'..
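
Why a repeated group does not "work this way" can be seen in a few lines. In this sketch the input string is hypothetical: a group under a quantifier keeps only its final repetition, while findall collects every occurrence.

```python
import re

x = "abcd"  # hypothetical input

# A repeated capturing group retains only the last thing it matched.
repeated_group = re.match(r"(\w)+", x).groups()

# findall with a single-character pattern returns every occurrence.
all_chars = re.findall(r"\w", x)

print(repeated_group)
print(all_chars)
```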

Need python lxml syntax help for parsing html

http://stackoverflow.com/questions/603287/need-python-lxml-syntax-help-for-parsing-html

able to figure out: self.mySearchTables = self.mySearchTree.findall(".//table") self.myResultRows = self.mySearchTables[1].findall(".//tr") I need to find the links contained in this table (this is.. for searchRow in self.myResultRows: searchLink = patentRow.findall(".//a") It doesn't seem to actually locate the link elements. I..
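
The nested findall calls can be tried out with the standard library's ElementTree, which shares the same simple path syntax; this is a substitute for lxml chosen so the sketch has no third-party dependency. The page content is invented and must be well-formed for this parser. Note that the loop variable is the one passed to the inner findall, whereas the snippet loops over searchRow but searches patentRow.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page with two tables.
page = """<html><body>
<table><tr><td>header</td></tr></table>
<table>
  <tr><td><a href="/patent/1">one</a></td></tr>
  <tr><td><a href="/patent/2">two</a></td></tr>
</table>
</body></html>"""

tree = ET.fromstring(page)

# ".//" searches all descendants, not just direct children.
tables = tree.findall(".//table")
rows = tables[1].findall(".//tr")

# Search each row for links, using the same variable the loop binds.
links = [a.get("href") for row in rows for a in row.findall(".//a")]
print(links)
```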

Beautiful Soup cannot find a CSS class if the object has other classes, too

http://stackoverflow.com/questions/1242755/beautiful-soup-cannot-find-a-css-class-if-the-object-has-other-classes-too

a page has <div class="class1"> and <p class="class1">, then soup.findAll(True, 'class1') will find them both. If it has <p class="class1..

Beautiful Soup findAll doen't find them all

http://stackoverflow.com/questions/16322862/beautiful-soup-findall-doent-find-them-all

I'm trying to parse a website and get some info with BeautifulSoup.findAll, but it doesn't find them all.. I'm using python3. The code is.. page.read() soup = BeautifulSoup(page.read()) manga_img = soup.findAll('a', {'class': 'manga_img'}, limit=None) for manga in manga_img: print..

BeautifulSoup - easy way to to obtain HTML-free contents

http://stackoverflow.com/questions/1752662/beautifulsoup-easy-way-to-to-obtain-html-free-contents

this code to find all interesting links in a page: soup.findAll('a', href=re.compile('^notizia.php\?idn=\d+')) And it does its job.. In the documentation it says to use text=True in the findAll method, but it will ignore my regex. Why? How can I solve that?.. I've used this: def textOf(soup): return u''.join(soup.findAll(text=True)) So... texts = [textOf(n) for n in soup.findAll('a', href..

BeautifulSoup Grab Visible Webpage Text

http://stackoverflow.com/questions/1936466/beautifulsoup-grab-visible-webpage-text

I can't figure out what are the right arguments to findAll(): http://www.crummy.com/software/BeautifulSoup/documentation.html#arg.. .read() soup = BeautifulSoup.BeautifulSoup(html) texts = soup.findAll(text=True) def visible(element): if element.parent.name in ['style'..

Using Beautiful Soup Python module to replace tags with plain text

http://stackoverflow.com/questions/2061718/using-beautiful-soup-python-module-to-replace-tags-with-plain-text

get picked up by the parser as content div results = soup.findAll(text=lambda x: len(x) > 20) When I use the above code to get at the.. to replace the tag with plain text as follows: anchors = soup.findAll('a') for a in anchors: a.replaceWith('plain text') The above does.. and that causes the same problem when I use findAll with the len(x) > 20. I can use regular expressions to parse the..

Matching id's in BeautifulSoup

http://stackoverflow.com/questions/2830530/matching-ids-in-beautifulsoup

... </div>''' soupHandler = BeautifulSoup(html) print soupHandler.findAll('div', id='post-*').. You can pass a function to findAll: print soupHandler.findAll('div', id=lambda x: x and x.startswith('post-')) # [<div id="post-45"..

Extracting readable text from HTML using Python?

http://stackoverflow.com/questions/3172343/extracting-readable-text-from-html-using-python

to separate them. htmlDom = BeautifulSoup(webPage); htmlDom.findAll(text=True) Alternately: from stripogram import html2text; extract.. of script tags with BeautifulSoup: nonscripttags = htmlDom.findAll(lambda t: t.name != 'script', recursive=False) will do that for you.. children which are non-script tags (and a separate htmlDom.findAll(recursive=False, text=True) will get strings that are immediate..

BeautifulSoup getting href [duplicate]

http://stackoverflow.com/questions/5815747/beautifulsoup-getting-href

BeautifulSoup before version 4, the name of this method is findAll. In version 4, BeautifulSoup's method names were changed to..

How to find tag with particular text with Beautiful Soup?

http://stackoverflow.com/questions/9007653/how-to-find-tag-with-particular-text-with-beautiful-soup

You can pass a regular expression to the text parameter of findAll, like so: import BeautifulSoup; import re; columns = soup.findAll('td', text=re.compile('your regex here'), attrs={'class': 'pos'})..