VCc?@sdZdZdZdZdddddgZd Zd eZd Zd gZd Z d dgZ ddk Z ddk Z ddk Z ddkZddkZddkZddkZddkZddkZddkZddkZyddklZWnddklZnXyddkZWn eZnXyddkZWn eZnXy6ddkZeiieddk l!Z"dZ#Wnd Z#dZ"nXyddk$Z$ddk%Z%WneZ$Z%nXyddk&Z'WnnXyddk(Z(WnnXy3ddk)Z)eoddk*Z)de)i+_nWn eZ)nXde,fdYZ-de-fdYZ.de-fdYZ/de-fdYZ0de,fdYZ1e i2de _3e i2d e _4e i2d!e _5hd"d#6d$d%6d&d'6d(d)6d*d+6d,d-6d.d/6d0d16d2d36d4d56d6d76d8d96d:d;6d<d=6d>d?6d@dA6dBdC6Z6y e7Z8Wn,e9j o ddDk8l8Z8dEZ7nXdFe8fdGYa:dHZ;ea<dIZ=e i2dJZ>dKZ?dLfdMYZ@e#o&dNe@eiiAiBfdOYZCndPe iDfdQYZEdRe@eEfdSYZFdTeEfdUYZGdVZHdWeEfdXYZIdYZJdZeiKeiLeiMfd[YZNd\ZOgZPd]ZQd^d_d`dadbdcdddedfdgdhdidjd#gZRgZSeRD]aZTeSeTiUdcdkiUdldmiUdndoiUdpdqiUdrdsiUdjdtdudvdwq[SZV[TgZWeVD]ZXeWe i2eXiYq[WZZ[XdxZ[eQe[dyZ\dzZ]d{Z^d|Z_d}Z`e i2d~e\e]e^fZae i2de_e`fZbdZceQecdZdeQede i2dZedZfeQefhdd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6Zghdd6dd6dd6dd6dd6dd6dd6Zhe i2dZidZjeQejh dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6dd6Zke i2dZldZmeQemdZneQendZohdd6dd6dd6dd6dd6ZpeiqirepeQeodZsdZtdZudZveeeegdZwexdjo{e iyd oeGHe izd ne iydZ{e;ddk|l|Z|x+e{D]#Z}e}GHHewe}Z~e|e~HqWndS(sUUniversal feed parser Handles RSS 0.9x, RSS 1.0, RSS 2.0, CDF, Atom 0.3, and Atom 1.0 feeds Visit http://feedparser.org/ for the latest version Visit http://feedparser.org/docs/ for the latest documentation Required: Python 2.1 or later Recommended: Python 2.3 or later Recommended: CJKCodecs and iconv_codec s4.1sCopyright (c) 2002-2006, Mark Pilgrim, All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.s'Mark Pilgrim s%Jason Diamond s'John Beimler s1Fazal Majid s"Aaron Swartz s(Kevin Marks is.UniversalFeedParser/%s +http://feedparser.org/sapplication/atom+xml,application/rdf+xml,application/rss+xml,application/x-netcdf,application/xml;q=0.9,text/xml;q=0.2,*/*;q=0.1t drv_libxml2tuTidytmxTidyiN(tStringIO(tescapeicCs:|idd}|idd}|idd}|S(Nt&s&t>s>ti?iiiiiiiiii`i:i#i@i'i=i"iiaibicidieifigihiiiiiiiiiijikiliminioipiqiriiiiiiii~isitiuiviwixiyiziiiiiiiiiiiiiiiiiiiiiii{iAiBiCiDiEiFiGiHiIiiiiiii}iJiKiLiMiNiOiPiQiRiiiiiii\iiSiTiUiViWiXiYiZiiiiiii0i1i2i3i4i5i6i7i8i9iiiiiiiRi(iiiiii iiiiii i i iiiiiiiiiiiiiiiiiiiiiiii iiiiiiiiiiiiiiiiiiiiiiiiiii iiiiiiiiii[i.i<i(i+i!i&iiiiiiiiii]i$i*i)i;i^i-i/iiiiiiiii|i,i%i_i>i?iiiiiiiiii`i:i#i@i'i=i"iiaibicidieifigihiiiiiiiiiijikiliminioipiqiriiiiiiii~isitiuiviwixiyiziiiiiiiiiiiiiiiiiiiiiii{iAiBiCiDiEiFiGiHiIiiiiiii}iJiKiLiMiNiOiPiQiRiiiiiii\iiSiTiUiViWiXiYiZiiiiiii0i1i2i3i4i5i6i7i8i9iiiiii(t_ebcdic_to_ascii_maptstringt maketranstjointmaptchrtranget translate(tstemapRn((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_ebcdic_to_asciis* :s&^([A-Za-z][A-Za-z0-9+-.]*://)(/*)(.*?)cCs"tid|}ti||S(Ns\1\3(t _urifixertsubturlparseturljoin(tbaseturi((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_urljoinst_FeedParserMixinc BseZh;dd6dd6dd6dd6dd6dd6dd6dd6dd6dd 6dd 6dd 6dd 6d d6dd6dd6dd6dd6dd6dd6dd6dd6dd 6d!d"6d#d$6d%d&6d'd(6d)d*6d+d,6d-d.6d/d06d1d26d3d46d5d66d5d76d8d96d:d;6d<d=6d>d?6d@dA6dBdC6dDdE6dFdG6dHdI6dJdK6dLdM6dNdO6dPdQ6dRdS6dTdU6dVdW6dXdY6dZd[6d\d]6d^d_6d`da6dbdc6ddde6dfdg6ZhZdhdidjdkdldmdndodpdqdrg Zddsdtdudvdwdxdydzg Zddsdtdudvdwdxdydzg Zd{d|gZddd}d~Z dZ dZ dZ dZ ddZdZdZdZdZdZdZdZdZdZddZdZdZdZdZdZdZdZdZ dZ!dZ"e"Z#dZ$dZ%dZ&e&Z'dZ(dZ)dZ*e*Z+dZ,e,Z-dZ.e.Z/e.Z0e.Z1e.Z2dZ3e3Z4e3Z5e3Z6e3Z7dZ8dZ9dZ:dZ;dZ<dZ=dZ>e>Z?dZ@e@ZAdZBdZCdZDdZEdZFeFZGeFZHdZIeIZJeIZKdZLeLZMdZNeNZOdZPddZQdZRddZSdZTeTZUeTZVdZWeWZXeWZYdZZeZZ[eZZ\dZ]e]Z^e]Z_dZ`e`Zae`ZbdZcecZddZeeeZfdZgegZhdZieiZjdZkekZldZmemZnemZodZpepZqepZrdZsesZtesZuesZvesZwdZxexZyexZzexZ{exZ|dZ}e}Z~dZeZdZdZdZdZdZdZdZeZeZdZdZdZeZeZeZdZdZeZdZeZdZdZdZeZeZdZeZeZdZdZdZeZdZeZdZeZdZdZdZdZdZeZdZeZdZdZdZdZdZdZeZdZeZdZeZeZeZeZeZdZeZdZdZRS(Rshttp://backend.userland.com/rsss%http://blogs.law.harvard.edu/tech/rssshttp://purl.org/rss/1.0/s&http://my.netscape.com/rdf/simple/0.9/shttp://example.com/newformat#shttp://example.com/nechoshttp://purl.org/echo/suri/of/echo/namespace#shttp://purl.org/pie/shttp://purl.org/atom/ns#shttp://www.w3.org/2005/Atoms'http://purl.org/rss/1.0/modules/rss091#tadminshttp://webns.net/mvcb/tags,http://purl.org/rss/1.0/modules/aggregation/tannotates)http://purl.org/rss/1.0/modules/annotate/taudios!http://media.tangent.org/rss/1.0/t blogChannels-http://backend.userland.com/blogChannelModuletccshttp://web.resource.org/cc/tcreativeCommonss4http://backend.userland.com/creativeCommonsRssModuletcos'http://purl.org/rss/1.0/modules/companytcontents(http://purl.org/rss/1.0/modules/content/tcps&http://my.theinfo.org/changed/1.0/rss/tdcs http://purl.org/dc/elements/1.1/tdctermsshttp://purl.org/dc/terms/temails&http://purl.org/rss/1.0/modules/email/tevs&http://purl.org/rss/1.0/modules/event/t feedburners*http://rssnamespace.org/feedburner/ext/1.0tfmshttp://freshmeat.net/rss/fm/tfoafshttp://xmlns.com/foaf/0.1/tgeos(http://www.w3.org/2003/01/geo/wgs84_pos#ticbmshttp://postneo.com/icbm/timages&http://purl.org/rss/1.0/modules/image/tituness*http://www.itunes.com/DTDs/PodCast-1.0.dtds'http://example.com/DTDs/PodCast-1.0.dtdtls%http://purl.org/rss/1.0/modules/link/tmediashttp://search.yahoo.com/mrsstpingbacks4http://madskills.com/public/xml/rss/module/pingback/tprisms.http://prismstandard.org/namespaces/1.2/basic/trdfs+http://www.w3.org/1999/02/22-rdf-syntax-ns#trdfss%http://www.w3.org/2000/01/rdf-schema#trefs*http://purl.org/rss/1.0/modules/reference/treqvs*http://purl.org/rss/1.0/modules/richequiv/tsearchs'http://purl.org/rss/1.0/modules/search/tslashs&http://purl.org/rss/1.0/modules/slash/tsoaps)http://schemas.xmlsoap.org/soap/envelope/tsss.http://purl.org/rss/1.0/modules/servicestatus/tstrs-http://hacks.benhammersley.com/rss/streaming/Rys-http://purl.org/rss/1.0/modules/subscription/tsys,http://purl.org/rss/1.0/modules/syndication/ttaxos)http://purl.org/rss/1.0/modules/taxonomy/tthrs*http://purl.org/rss/1.0/modules/threading/ttis*http://purl.org/rss/1.0/modules/textinput/t trackbacks5http://madskills.com/public/xml/rss/module/trackback/twfws$http://wellformedweb.org/commentAPI/twikis%http://purl.org/rss/1.0/modules/wiki/txhtmlshttp://www.w3.org/1999/xhtmltxmls$http://www.w3.org/XML/1998/namespacetszfs/http://schemas.pocketsoap.com/rss/myDescModule/tlinkR0t wfw_commenttwfw_commentrsstdocsR:R9tcommentstlicenseticontlogottitleR7tinfoRER6RBRAR8s text/htmlsapplication/xhtml+xmlsutf-8cCs^totiidn|ip7x4|iiD]\}}||i|i                       cCstotiid||fng}|D]\}}||i|fq0~}g}|D]3\}}|||d#jo |ip|fqc~}t|}|id|idp|i}t|i||_|id|id} | djo d} n| djo |i } n| o"|d$jo| |i d s %s="%s"RiiRaRRR8tnameR:R9twidththeightt_start_(Rstype(sfeedRsrdf:RDF(stitleslinks descriptionsname(stitleslinks descriptionsurlshrefswidthR(RRRRRR*RORR~RiRRRtappendRRdttrackNamespaceRRRStendswithtsplitt handle_dataRptfindRRRtgetattrR_tpush(RTRWtattrsRVR(R)t_[2]tattrsDRRtprefixR}t_[3]tttsuffixt methodnametmethod((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pytunknown_starttagsZ3G %        =# F   cCstotiid|n|iddjo|idd\}}nd|}}|ii||}|o|d}nd||}yt||}|Wn$t j o|i ||nX|i oD|i i do1|i idd id  od |i dRi(RRRRRRRRORR_tpopRRRSRRRRRR(RTRWRRRR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pytunknown_endtags6  =#     c Cs|ipdS|i}|djod |}nJ|d d jot|dd}n t|}t|id}|iddi|dS(Nt34t38t39t60t62tx22tx26tx27tx3ctx3es&#%s;itxiisutf-8ii( RRRRRRRRRR(RRtinttunichrtencodeR(RTRttexttc((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pythandle_charrefs    cCs|ipdStotiid|n|d jod|}nSd}y||Wntj od|}nXt||id }|id d i|dS( Ns"entering handle_entityref with %s tlttgttquottamptaposs&%s;cSsqddk}t|do |i|S|i|}|ido"|idot|dd!St|S(Nitname2codepoints&#t;i(thtmlentitydefsR^Rt entitydefsRdRRtord(R(R ((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pytname2cps    sutf-8ii(RRRRR( RRRRRRcRRR(RTRRR ((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pythandle_entityrefs   icCs[|ipdS|o)|iiddjot|}n|iddi|dS(NRPsapplication/xhtml+xmlii(RRROR R(RTRR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRs   cCsdS(N((RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pythandle_commentscCsdS(N((RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt handle_pi!scCsdS(N((RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt handle_decl%scCstotiidn|i||d!djob|iid|}|djot|i}n|it|i|d|!d|dS|iid|}|d SdS( Nsentering parse_declaration i s iiiRi( RRRRtrawdataRtlenRR (RTtiR(((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pytparse_declaration(s $ cCsU|i}|djo d}n/|djo d}n|djo d}n|S(NRs text/plainthtmls text/htmlRsapplication/xhtml+xml(R(RTt contentType((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pytmapContentType4s       cCs|i}||fd jo|i o d|_n|djo|i o d|_n|djo|i o d|_n|iddjod }|}n|ii|o,|i||i|<||i|i|s     cCst|ipd|S(NR(R~R(RTR}((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt resolveURIPscCs|S(N((RTtelementR ((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pytdecodeEntitiesSscCs|ii||ggdS(N(RR(RTRt expectingText((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRVscCsU|ipdS|idd|jodS|ii\}}}di|}|o|i}n|p|Sto[|iiddoEyti|}Wqti j oqti j oqXn||i jo|o|i |}n|iiddp|i ||}ny|id=Wntj onXy|id=Wntj onX|i|iidd|ijo0||ijot||i|i}qn|i|iidd|ijo*||ijot||i}q n|io=t|tdjo$yt||i}WqgqgXn|d jo|S|io|i o |d joO|idi|gti|i}||d <|id|i|qQ|d jo9||id|<|o||idd ddt _parse_date(RTR[((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_end_publishedscCs|idddS(NR2i(R(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_start_updated$scCs/|id}t|}|id|dS(NR2R4(RR}R>(RTR[t parsed_value((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt _end_updated+s cCs|idddS(Ntcreatedi(R(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_start_created4scCs)|id}|idt|dS(NRtcreated_parsed(RR>R}(RTR[((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt _end_created8scCs|idddS(Ntexpiredi(R(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_start_expirationdate=scCs#|idt|iddS(Ntexpired_parsedR(R>R}R(RT((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_end_expirationdate@scCsV|idd|i|d}|o|iddi|n|iddS(NRis rdf:resourceii(RR<RRR(RTRR[((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_start_cc_licenseCs cCs|idddS(NRi(R(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_start_creativecommons_licenseJscCs|iddS(NR(R(RT((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_end_creativecommons_licenseMscCs|i}|idg}| o| o | odSth|d6|d6|d6}||jo,|ith|d6|d6|d6ndS(NRIRJRLtlabel(R0R]R+R(RTRJRLRR4RIR[((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_addTagPs ! cCstotiidt|n|id}|id|id}|id}|i||||idddS(Ns!entering _start_category with %s RJRLtdomainRRHi(RRRRtreprRORR(RTRRJRLR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_start_categoryXscCs7x0|idiD]}|i|ddqWdS(Ntitunes_keywordsshttp://www.itunes.com/(RRRRi(RTRJ((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_end_itunes_keywordsbscCs0|i|iddd|idddS(NRshttp://www.itunes.com/RHi(RRORiR(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_start_itunes_categoryfscCs~|id}|pdS|i}|d}|o/t|o"|dd o||ddRuR0RS(RTR[((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt _end_guids* cCs1|id|d|ip|ip|idS(NRs text/plain(R7RRR(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt _start_titlescCsW|id}|i}|io||dd|i|}|ido|i|d|d(RTtcopyToDescriptionR[((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR$s+cCs:|iddth|idd6|idRARBRDt_start_feedinfoRCRJRKt _end_feedRLRMROt_start_textInputRPt_end_textInputRRt_start_managingeditort_start_dc_authort_start_dc_creatort_start_itunes_authorRTt_end_managingeditort_end_dc_authort_end_dc_creatort_end_itunes_authorRVRWRZR[R\R^R_t_start_itunes_nameR]t_end_itunes_nameRbRcRdReRft_start_homepaget _start_uriRgt _end_homepaget_end_uriRht_start_itunes_emailRit_end_itunes_emailR0R`RaRSRpt_start_taglinet_start_itunes_subtitleRqt _end_taglinet_end_itunes_subtitleRrt_start_dc_rightst_start_copyrightRst_end_dc_rightst_end_copyrightRvt _start_entryt_start_productRwt _end_entryRxt_start_languageRyt _end_languageRzt_start_webmasterR{t_end_webmasterR|t_start_dcterms_issuedt _start_issuedR~t_end_dcterms_issuedt _end_issuedRRFt_start_dcterms_modifiedt_start_pubdatet_start_dc_dateRRGt_end_dcterms_modifiedt _end_pubdatet _end_dc_dateRt_start_dcterms_createdRt_end_dcterms_createdRRRRRRRt_start_dc_subjectt_start_keywordsRRRt_end_dc_subjectt _end_keywordst_end_itunes_categoryRRHt_start_producturlRIt_end_producturlRRRt_start_dc_titlet_start_media_titleRt _end_dc_titlet_end_media_titleRRRt _end_abstractRt!_start_feedburner_browserfriendlyRt_end_feedburner_browserfriendlyRRRRRt_start_itunes_summaryRt_end_itunes_summaryRRRRRRt_start_xhtml_bodyRt_start_fullitemRt _end_bodyt_end_xhtml_bodyt_end_content_encodedt _end_fullitemt _end_prodlinkRt_start_itunes_linkRR(((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR s  '!! " F %           Y                                                                                    t_StrictFeedParsercBsGeZdZdZdZdZdZdZdZRS(cCs]totiidntiiii|t i||||d|_ d|_ dS(Nstrying StrictFeedParser i( RRRRRtsaxthandlertContentHandlerRRtbozoRitexc(RTRRR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR>s  cCs|i||dS(N(R(RTRR}((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pytstartPrefixMappingEsc Cs"|\}}t|pdi}|iddjod}|}n|o-|iddjo|idd}nd}|ii||}|oL|djp|djo|djo%|ii| ot d|n|o|d|}nt|i}t o0t i i d|||||i|fnh} xz|iiD]i\\}} } |pdi}|ii|d}|o|d| } n| | t| is>s't's"t"u( Rltcompilet IGNORECASERyR/RRRPRR*R+R,(RTR ((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR,s!#c Cs~g}|D]\}}||i|fq ~}g}|D]3\}}|||djo |ip|fq>~}|S(NRRP(srelstype(R(RTRRVR(R)R((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pytnormalize_attrss3Gc Cstotiid|ng}xb|D]Z\}}t|tdjot||i}n|it||i|fq,Wdig}|D]\}}|d||fq~i |i}||i jo|i idt n|i idt dS(Ns-_BaseHTMLProcessor, unknown_starttag, tag=%s uu %s="%s"s<%(tag)s%(strattrs)s />s<%(tag)s%(strattrs)s>( RRRRRPR-RRRpRR-R2tlocals(RTRWRtuattrsRUR[RVtstrattrs((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRs #FcCs/||ijo|iidtndS(Ns (R-R2RR5(RTRW((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRscCs|iidtdS(Ns &#%(ref)s;(R2RR5(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRscCs|iidtdS(Ns &%(ref)s;(R2RR5(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRscCs3totiid|n|ii|dS(Ns)_BaseHTMLProcessor, handle_text, text=%s (RRRRR2R(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRscCs|iidtdS(Ns(R2RR5(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRscCs|iidtdS(Ns (R2RR5(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRscCs|iidtdS(Ns (R2RR5(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRss-zA-Z][-_.a-zA-Z0-9:]*\s*cCs|i}t|}||jodS|i||}|oK|i}|i}|t||jodS|i|ifS|i|dSdS(Ni(Ni(Ni(Ni( RRRit_new_declname_matchRmR"RtendR(RTRt declstartposRtntmRuR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt _scan_names      cCs1dig}|iD]}|t|q~S(s(Return processed HTML as a single stringR(RpR2R(RTRVtp((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR3s(R R R-RR,R/R,R4RRRRRRRRRlR2R.R8R=R3(((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRs$             t_LooseFeedParsercBseZdZdZRS(cCs*tii|ti||||dS(N(R*R+RR(RTRRR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRscCsI|idd}|idd}|idd}|idd}|idd}|id d}|id d }|id d }|id d}|idd}|iido~|iiddid o^|idd}|idd}|idd}|id d}|idd}n|S(Ns<s<s<s>s>s>s&s&s&s"s"s"s's's'RPRRRRR1R0(RRRSROR(RTRR ((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRs"3(R R RR(((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR?s t_RelativeURIResolvercBsteZdd d!d"d#d$d%d&d'd(d)d*d+d,d-d.d/d0d1d2d3d4d5d6d7gZdZdZdZRS(8taR9tapplettcodebaseRt blockquotetcitetbodyt backgroundtdeltformtactionR#tlongdescRtiframetheadtprofileR%tusemapR&tinsRtobjecttclassidR tqtscriptcCsti||||_dS(N(RRR(RTRR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR+scCst|i|S(N(R~R(RTR}((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR/scCsy|i|}g}|D]?\}}||||f|ijo|i|p|fq~}ti|||dS(N(R4t relative_urisRRR(RTRWRRVRUR[((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR2sS(RAshref(RBRC(sareashref(RDRE(RFRG(RHRE(RIsaction(sframeRK(sframessrc(RLRK(RLssrc(sheadRN(simgRK(simgssrc(simgRO(sinputssrc(sinputRO(RPRE(slinkshref(sobjectRR(sobjectRC(sobjectsdata(sobjectRO(RSRE(RTssrc(R R RURRR(((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR@s8   cCsAtotiidnt||}|i||iS(Nsentering _resolveRelativeURIs (RRRRR@R,R3(t htmlSourcetbaseURIRR>((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR*7s  t_HTMLSanitizercGBseZdddddddddd d d d d ddddddddddddddddddd d!d"d#d$d%d&d'd(d)d*d+d,d-d.d/d0d1d2d3d4d5d6d7d8d9d:d;d<d=d>d?d@dAdBdCdDdEdFgGZddGdHdIdJdKdLdMdNdOdPdQdRdSdTd dUdVdWdXdYdZd[d\dd]d^d_d`dadbdcdddedfdgd'dhdidjdkdldmdndodpdqdrdsdtdudvdwdxdydzd{d|d6d}d~ddddddddddgGZddgZdZdZdZdZdZ dZ RS(RAtabbrtacronymtaddressRtbtbigRDR!tbuttontcaptiontcenterREtcodeR"tcolgrouptddRHtdfntdirtdivtdltdttemtfieldsettfontRIth1th2th3th4th5th6R$RR%R&RPtkbdRtlegendtliRqtmenutoltoptgrouptoptionR>tpreRSRutsamptselecttsmalltspantstriketstrongRytsupttablettbodyttdttextareattfoottthttheadttrttttutultvartacceptsaccept-charsett accesskeyRJtaligntalttaxistbordert cellpaddingt cellspacingtchartcharofftcharsettcheckedtclassR8tcolstcolspantcolortcompacttcoordstdatetimetdisabledtenctypetforR#theadersRR9threflangthspaceR0tismapRRKt maxlengthRRtmultipleRtnohreftnoshadetnowraptprompttreadonlyRtrevtrowstrowspantrulestscopetselectedtshapetsizeRtstartR7ttabindexttargetRRPROtvalignR[tvspaceRRTRBcCsti|d|_dS(Ni(RR,tunacceptablestack(RT((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR,Ts cCs||ijo(||ijo|id7_ndS|i|}g}|D]-\}}||ijo|||fqRqR~}ti|||dS(Ni(tacceptable_elementst"unacceptable_elements_with_end_tagRR4tacceptable_attributesRR(RTRWRRVRUR[((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRXsAcCsL||ijo(||ijo|id8_ndSti||dS(Ni(RRRRR(RTRW((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRas cCsdS(N((RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRhscCsdS(N((RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRkscCs"|ipti||ndS(N(RRR(RTR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRns ( R R RRRR,RRRRR(((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRX=s2$      c st|}|i||i}tod}x~tD]v}yf|djo$ddklfd}Pn2|djo$ddkl fd}PnWq9q9Xq9W|ot |t dj}|o|i d }n||d d d d d ddd}|ot |d }n|i doD|idd d }|i do|idd d }q~n|i do|idd d}qqn|iidd}|S(NRi(t parseStringcst||S(N(R(R tkwargs(t_utidy(sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_tidy~sR(tTidycs"i||\}}}}|S(N(ttidy(R Rtnerrorst nwarningst errordata(t_mxtidy(sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRsusutf-8t output_xhtmlitnumeric_entitiestwrapit char_encodingtutf8sR Rttidy_interfaceR((RRsB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR,rsB      $"t_FeedURLHandlercBs>eZdZdZdZeZeZeZdZRS(cCs`|ddjo'|djo|i|||||Sti|||i}||_|S(Nidii0(thttp_error_302turllibt addinfourlt get_full_urltstatus(RTtreqtfpRatmsgRtinfourl((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pythttp_error_defaults  cCst|iido%tii||||||}nti|||i}t|dp ||_ n|S(NtlocationR( R*RSturllib2tHTTPRedirectHandlerRRRRR^R(RTRRRaRRR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRs % cCst|iido%tii||||||}nti|||i}t|dp ||_ n|S(NRR( R*RSRRthttp_error_301RRRR^R(RTRRRaRRR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRs % c Csti|id}ytiiddjpttdjptti|i diddid\}}t i d|dd} |i | ||||i d |||} |i| SWn|i|||||SXdS( Niis2.3.3t Authorizationt Rsrealm="([^"]*)"sWWW-Authenticateswww-authenticate(RzRRRRReRRiR#RRltfindallt add_passwordthttp_error_auth_reqedtreset_retry_countR( RTRRRaRRthosttusertpasswtrealmtretry((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pythttp_error_401s !2 ( R R RRRthttp_error_300thttp_error_303thttp_error_307R(((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRs  c Cst|do|S|djotiSti|dd1jo|p t}nd0}toti|\}}ti |\} }| oLti | \} } | o,d|| |f}ti | i }qqnt i|} | id||o| id |n|od d d d dddg} ddddddddddddg } | idd| |d|d | |d!d!|d|d"|d#|d$fn|o| id%|ntoto| id&d'nGto| id&d(n,to| id&d)n| id&d*|o| id+d,|nto| id-tn| id.d/tt ittg|}g|_z|i| SWd0|iXnyt|SWnnXtt|S(2s8URL, filename, or string --> stream This function lets you define parsers that take any input source (URL, pathname to local or network file, or actual data as a string) and deal with it in a uniform manner. Returned object is guaranteed to have all the basic stdio read methods (read, readline, readlines). Just .close() the object when you're done with it. If the etag argument is supplied, it will be used as the value of an If-None-Match request header. If the modified argument is supplied, it must be a tuple of 9 integers as returned by gmtime() in the standard Python time module. This MUST be in GMT (Greenwich Mean Time). The formatted date/time will be used as the value of an If-Modified-Since request header. If the agent argument is supplied, it will be used as the value of a User-Agent request header. If the referrer argument is supplied, it will be used as the value of a Referer[sic] request header. If handlers is supplied, it is a list of handlers used to build a urllib2 opener. treadt-ithttpthttpstftps %s://%s%ss User-Agents If-None-MatchtMontTuetWedtThutFritSattSuntJantFebtMartAprtMaytJuntJultAugtSeptOcttNovtDecsIf-Modified-Sinces#%s, %02d %s %04d %02d:%02d:%02d GMTiiiiiitReferersAccept-encodings gzip, deflatetgziptdeflateRRsBasic %stAcceptsA-IMR,N(RRR(R^RtstdinRzt USER_AGENTRiRRt splittypet splithostt splitusert encodestringR"RtRequestt add_headerR tzlibt ACCEPT_HEADERtapplyt build_openerttupleRt addheaderstopentcloset _StringIOR(turl_file_stream_or_stringtetagR;tagenttreferrerthandlerstauthturltypetresttrealhostt user_passwdtrequesttshort_weekdaystmonthstopener((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_open_resources`  !*U" cCstid|dS(sLRegister a date handler function (takes string, returns 9-tuple date in GMT)iN(t_date_handlerstinsert(tfunc((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pytregisterDateHandler*ss YYYY-?MM-?DDsYYYY-MMs YYYY-?OOOs YY-?MM-?DDsYY-?OOOtYYYYs-YY-?MMs-OOOs-YYs--MM-?DDs--MMs---DDtCCs(?P\d{4})tYYs(?P\d\d)tMMs(?P[01]\d)tDDs(?P[0123]\d)tOOOs(?P[0123]\d\d)s(?P\d\d$)s$(T?(?P\d{2}):(?P\d{2})s(:(?P\d{2}))?s6(?P[+-](?P\d{2})(:(?P\d{2}))?|Z)?)?c Csd}x&tD]}||}|oPq q W|pdS|idjodS|i}|idd}|ot|}nd}|idd}| p |djotid}nLt|djo,dttiddt|}n t|}|idd }| p |d jo%|o d }q`tid }nt|}|id d}|ph|o |}q|id dp&|iddp|iddo d }qtid}n t|}d |i jo t|d d dd }nx>d ddddgD]'}|i|dpd||tweekdaytdaylight_savings_flagttmRA((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_parse_date_iso8601Lsv   ,    &    "&"&u년u월u일u오전u오후s;(\d{4})%s\s+(\d{2})%s\s+(\d{2})%s\s+(\d{2}):(\d{2}):(\d{2})u>(\d{4})-(\d{2})-(\d{2})\s+(%s|%s)\s+(\d{,2}):(\d{,2}):(\d{,2})cCsti|}|pdSdh|idd6|idd6|idd6|id d 6|id d 6|id d6dd6}totiid|nt|S(s8Parse a string according to the OnBlog 8-bit date formatNsE%(year)s-%(month)s-%(day)sT%(hour)s:%(minute)s:%(second)s%(zonediff)siR8iR9iR:iR<iR=iR>s+09:00tzonediffsOnBlog date parsed as: %s (t_korean_onblog_date_reR.RmRRRRt_parse_date_w3dtf(RJR<t w3dtfdate((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_parse_date_onblogs30 cCsti|}|pdSt|id}|id}|tjo|d7}nt|}t|djod|}ndh|idd6|id d 6|id d 6|d 6|idd6|idd6dd6}toti i d|nt |S(s6Parse a string according to the Nate 8-bit date formatNiii it0sE%(year)s-%(month)s-%(day)sT%(hour)s:%(minute)s:%(second)s%(zonediff)sR8iR9iR:R<iR=iR>s+09:00RRsNate date parsed as: %s ( t_korean_nate_date_reR.RRmt _korean_pmRRRRRRRT(RJR<R<tampmRU((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_parse_date_nates"  3' s9(\d{4})-(\d{2})-(\d{2})\s+(\d{2}):(\d{2}):(\d{2})(\.\d+)?cCsti|}|pdSdh|idd6|idd6|idd6|id d 6|id d 6|id d6dd6}totiid|nt|S(s2Parse a string according to the MS SQL date formatNsE%(year)s-%(month)s-%(day)sT%(hour)s:%(minute)s:%(second)s%(zonediff)siR8iR9iR:iR<iR=iR>s+09:00RRsMS SQL date parsed as: %s (t_mssql_date_reR.RmRRRRRT(RJR<RU((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_parse_date_mssqls30 uJanuΙανuFebuΦεβuMaruΜάώuΜαώuApruΑπρuMayuΜάιuΜαϊuΜαιuJunuΙούνuΙονuJuluΙούλuΙολuAuguΑύγuΑυγuSepuΣεπuOctuΟκτuNovuΝοέuΝοεuDecuΔεκuSunuΚυρuMonuΔευuTueuΤριuWeduΤετuThuuΠεμuFriuΠαρuSatuΣαβuL([^,]+),\s+(\d{2})\s+([^\s]+)\s+(\d{4})\s+(\d{2}):(\d{2}):(\d{2})\s+([^\s]+)cCsti|}|pdSy*t|id}t|id}WndSXdh|d6|idd6|d6|id d 6|id d 6|id d6|idd6|idd6}totiid|nt |S(s6Parse a string according to a Greek 8-bit date format.NiisP%(wday)s, %(day)s %(month)s %(year)s %(hour)s:%(minute)s:%(second)s %(zonediff)stwdayiR:R9iR8iR<iR=iR>iRRsGreek date parsed as: %s ( t_greek_date_format_reR.t _greek_wdaysRmt _greek_monthsRRRRt_parse_date_rfc822(RJR<R^R9t rfc822date((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_parse_date_greeks10u01ujanuáru02u februáriu03umárciusu04uáprilisu05umáujusu06ujúniusu07ujúliusu08u augusztusu09u szeptemberu10uoktóberu11unovemberu12udecemberu?(\d{4})-([^-]+)-(\d{,2})T(\d{,2}):(\d{2})((\+|-)(\d{,2}:\d{2}))cCsti|}|pdSywt|id}|id}t|djod|}n|id}t|djod|}nWndSXdh|idd6|d 6|d 6|d 6|id d 6|idd6}totiid|nt |S(s:Parse a string according to a Hungarian 8-bit date format.NiiiRWis:%(year)s-%(month)s-%(day)sT%(hour)s:%(minute)s%(zonediff)sR8R9R:R<iR=iRRsHungarian date parsed as: %s ( t_hungarian_date_format_reR.t_hungarian_monthsRmRRRRRRT(RJR<R9R:R<RU((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_parse_date_hungarian!s(!c Csd}d}d}d}d}ti|}d|}d||f}ti|} | i|} | djp| i|jodS|| || d } | ddjodStiti| || tiS( Nc Sst|id}|djo,dttiddt|}n|djod S|id}|ot|}|dd}|dd}d}x||joti|||ddddddf }ti|d}t||}||jo/||jo||}qw|d}d }q||jo-||d jo||}qw|d}qqW|||fS|id }d}|djo d}n9t|}|id }|ot|}nd}|||fS(NR8idiitjulianiiiiiR9R:(iii(RRmRFRGRiRItabs(R<R8RhR9R:tjdayRtdiff((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt__extract_date<sH ,   *        cSs|pdS|id}|pdSt|}t|id}|id}|ot|}nd}|||fS(Nithourstminutestseconds(iii(iii(RmR(R<RmRnRo((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt__extract_timees cSs|pdS|id}|pdS|djodSt|id}|id}|ot|}nd}|d|d}|ddjo| S|S(sAReturn the Time Zone Designator as an offset in seconds from UTC.ittzdRBttzdhourst tzdminutesi<RC(RmR(R<RqRmRntoffset((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt __extract_tzdts  sd(?P\d\d\d\d)(?:(?P-|)(?:(?P\d\d\d)|(?P\d\d)(?:(?P=dsep)(?P\d\d))?))?s;(?P[-+](?P\d\d)(?::?(?P\d\d))|Z)sW(?P\d\d)(?P:|)(?P\d\d)(?:(?P=tsep)(?P\d\d(?:[.,]\d+)?))?s %s(?:T%s)?i(iii( RlR2R.RiRmRFRGRIttimezone( RJRlRpRut __date_ret__tzd_ret__tzd_rxt __time_ret __datetime_ret __datetime_rxR<tgmt((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRT;s" )   cCs|i}|dddjp|ditijo |d=nt|djof|d}|id}|djo || ||dg|d)n|id d i|}nt|d jo|d 7}nti|}|ot i ti |Sd S(s8Parse an RFC822, RFC1123, RFC2822, or asctime-style dateiit,t.iiRCiRRis 00:00:00 GMTN(R~R( RRtrfc822t _daynamesRRRRpt parsedate_tzRFRGt mktime_tz(RJR RuRRP((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyRbs /     iptATi tETitCTiDtMTitPTcCsxtD]}yg||}|pwnt|djo%totiidntntt||SWqt j o7}to'tiid|i t |fqqXqWdS(s6Parses a variety of date formats into a 9-tuple in GMTi s*date handler function must return 9-tuple s %s raised %s N( R-RRRRRt ValueErrorRqRt ExceptionR RRi(RJR t date9tuplete((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR}s"   ' c Csd}d}d}d}||id\}}yj|d djot|}n.|d djo"d}t|did}nt|djoK|d d jo:|d d!d jo&d}t|d did}n|d d jo"d }t|d id}njt|djoK|d djo:|d d!d jo&d }t|d d id}n |d djo"d}t|did}n|d djo"d}t|did}n|d djo&d}t|ddid}no|d djo&d}t|ddid}n8|d djo&d}t|ddid}ntidi|}Wn d0}nX|o8|i di }|o|d1jo |}qnd} d2} d3} || jp |i d*o.|i d+od,} |p |pd}n|| jp |i d-o'|i d+od,} |pd.}nX|i d-o|pd.}n7|o"|i d o|pd/}n|pd}||||| fS(4s Get the character encoding of the XML document http_headers is a dictionary xml_data is a raw string (not Unicode) This is so much trickier than it sounds, it's not even funny. According to RFC 3023 ('XML Media Types'), if the HTTP Content-Type is application/xml, application/*+xml, application/xml-external-parsed-entity, or application/xml-dtd, the encoding given in the charset parameter of the HTTP Content-Type takes precedence over the encoding given in the XML prefix within the document, and defaults to 'utf-8' if neither are specified. But, if the HTTP Content-Type is text/xml, text/*+xml, or text/xml-external-parsed-entity, the encoding given in the XML prefix within the document is ALWAYS IGNORED and only the encoding given in the charset parameter of the HTTP Content-Type header should be respected, and it defaults to 'us-ascii' if not specified. Furthermore, discussion on the atom-syntax mailing list with the author of RFC 3023 leads me to the conclusion that any document served with a Content-Type of text/* and no charset parameter must be treated as us-ascii. (We now do this.) And also that it must always be flagged as non-well-formed. (We now do this too.) If Content-Type is unspecified (input was local file or non-HTTP source) or unrecognized (server just got it totally wrong), then go by the encoding given in the XML prefix of the document and default to 'iso-8859-1' as per the HTTP specification (RFC 2616). Then, assuming we didn't find a character encoding in the HTTP headers (and the HTTP Content-type allowed us to look in the body), we need to sniff the first few bytes of the XML data and try to determine whether the encoding is ASCII-compatible. Section F of the XML specification shows the way here: http://www.w3.org/TR/REC-xml/#sec-guessing-no-ext-info If the sniffed encoding is not ASCII-compatible, we need to make it ASCII compatible so that we can sniff further into the XML declaration to find the encoding attribute, which will tell us the true encoding. Of course, none of this guarantees that we will be able to parse the feed in the declared character encoding (assuming it was declared correctly, which many are not). CJKCodecs and iconv_codec help a lot; you should definitely install them if you can. http://cjkpython.i18n.org/ cSsD|pd}ti|\}}||iddiddfS(s takes HTTP Content-Type header and returns (content type, charset) If no charset is specified, returns (content type, '') If no content type is specified, returns ('', '') Both return parameters are guaranteed to be lowercase strings RRR0(tcgit parse_headerROR(t content_typeRL((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_parseHTTPContentTypes Rs content-typeisLot<?sutf-16besutf-8ists<?sutf-16lestisiso-10646-ucs-2sucs-2t csunicodesiso-10646-ucs-4sucs-4tcsucs4sutf-16sutf-32tutf_16tutf_32tutf16tu16sapplication/xmlsapplication/xml-dtds&application/xml-external-parsed-entitystext/xmlstext/xml-external-parsed-entitys application/s+xmlistext/sus-asciis iso-8859-1N( siso-10646-ucs-2sucs-2Rsiso-10646-ucs-4sucs-4Rsutf-16sutf-32sutf_16sutf_32sutf16su16(sapplication/xmlsapplication/xml-dtds&application/xml-external-parsed-entity(stext/xmlstext/xml-external-parsed-entity(RORwR-RRRlR2R.RitgroupsRRdRRS( t http_headerstxml_dataRtsniffed_xml_encodingt xml_encodingt true_encodingthttp_content_typet http_encodingtxml_encoding_matchtacceptable_content_typetapplication_content_typesttext_content_types((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_getCharacterEncodingsv0 8 8          c Cstotiid|nt|djou|d djod|dd!djoPto5tiid|djotiidqnd}|d}nt|djou|d d jod|dd!djoPto5tiid|d jotiid qnd }|d}n$|d d joPto5tiid|djotiidq|nd}|d }n|d djoPto5tiid|djotiidqnd}|d}nb|d djoPto5tiid|djotiidq>nd}|d}nt||}totiid|ntid}d}|i|o|i ||}n|d|}|i dS(sChanges an XML data stream on the fly to specify a new encoding data is a raw sequence of bytes (not Unicode) that is presumed to be in %encoding already encoding is a string recognized by encodings.aliases s%entering _toUTF8, trying encoding %s iisRsstripping BOM sutf-16bestrying utf-16be instead ssutf-16lestrying utf-16le instead issutf-8strying utf-8 instead Rsutf-32bestrying utf-32be instead ssutf-32lestrying utf-32le instead s*successfully converted %s data to unicode s^<\?xml[^>]*?>s&u ( RRRRRR-RlR2RRyR(R Rtnewdatat declmatchtnewdecl((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt_toUTF8J s\8 8    cCstidti}|id|}tidti}|i|}|o |dpd}|iido d}nd}|id|}||fS(sStrips DOCTYPE from XML document, returns (rss_version, stripped_data) rss_version may be 'rss091n' or None stripped_data is the same XML document, minus the DOCTYPE s]*?)>Rs]*?)>itnetscapeRN(RlR2t MULTILINERyRRRRi(R tentity_patterntdoctype_patterntdoctype_resultstdoctypeR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt _stripDoctype s cCst}t|dy0d"}|i|t||}d}}WqqXn| oKd#|jo>y0d#}|i|t||}d}}WqcqcXn|p:d|d|d<|i?|d-<|S(.s0Parse a feed from a URL, file, stream, or stringR,R.iR itbozo_exceptionRRscontent-encodingR tfileobjR RtETagRs Last-ModifiedR;R:R9iRRRs content-types%s is not an XML media typesno Content-type specifiedRscontent-locationscontent-languagei0s1The feed has not changed since you last checked, s:so the server sent no data. This is a feature, not a bug!t debug_messagesutf-8s windows-1252s#document encoding unknown, I tried s2%s, %s, utf-8, and windows-1252 but nothing workeds+documented declared as %s, but parsed as %st _ns_stackRs$http://www.w3.org/XML/1998/namespaceiNsxml parsing failed R(@R+t_XML_AVAILABLERPRQt InstanceTypeR,RRRiR^R RROtGzipFileRRt decompresst MAX_WBITSRt getheaderR}R:RR*RRRSRRRRtchardettdetectRRRRR t make_parsertPREFERRED_XML_PARSERSt setFeatureR tfeature_namespacestsetContentHandlertsetErrorHandlert xmlreadert InputSourcet setByteStreamRtparseRt tracebackt print_stackt print_excRRRR R?R,RR.RR(RRR;R R!R"tresulttfR RRt last_modifiedRRRRRt bozo_messageRRtuse_strict_parsertknown_encodingttried_encodingstproposed_encodingt feedparsert saxparserRR((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyR s(       #"  #    "                      t__main__(tpprint(t__doc__t __version__t __license__t __author__t__contributors__RRRRRRR*RlRR.RzRFRRQRRRt cStringIORRR RiRtxml.saxRR Rtxml.sax.saxutilsRR RRR$tcjkcodecs.aliasest cjkcodecst iconv_codecRtchardet.constantst constantsRR RRRRR2ttagfindtspecialtcharreftSUPPORTED_VERSIONSR*R%t NameErrorR+RlRmRwRxR~RR R RR+RR?R@R*RXR,tHTTPDigestAuthHandlerRtHTTPDefaultErrorHandlerRR,R-R0t _iso8601_tmplRVttmplRt _iso8601_reRtregexR.RDRQt _korean_yeart _korean_montht _korean_dayt _korean_amRYRSRXRVR[R\R]RaR`R_RdRfReRgRTRbt_additional_timezonest _timezonesRjR}RRRRR targvtexitturlsRR:R(((sB/afs/athena.mit.edu/user/w/d/wdaher/Public/rssZephyr/feedparser.pyt s               K    &Hs' 5 '%6 Z     n- N               ]  )   5