StringRegExp problems

  • Hey everyone, I'm currently building a parser for HTML source,

    but I'm still having considerable trouble with StringRegExp...

    Here are two examples:

    I have this code:

    PHP
    <p><b>1 - 7</b> van 7 resultaten <i>(0 s)</i></p>

    and I want to get the "1 - 7" and the "7" out of it, but these values change all the time,
    and on top of that this markup occurs fairly often in the source. I tried "<p><b>1 -" to
    "</b> van" to get the first 7, and "/b> van" and " resultaten" to get the
    second one; the rest I would have filtered out with StringReplace... but it doesn't work.
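
    For this first problem, a single StringRegExp call can pull out all three numbers at once. A minimal sketch, assuming the counter always has the form shown above; $sSource and $aMatch are illustrative names, not from the original post:

    [autoit]

    ; Sample line from the post; in practice $sSource would hold the whole page.
    Local $sSource = '<p><b>1 - 7</b> van 7 resultaten <i>(0 s)</i></p>'

    ; \d+ matches any run of digits, so the pattern still fits when the values change.
    ; The literal " van " and " resultaten" tie the match to the counter line.
    ; Flag 1 returns the capture groups of the first match as an array.
    Local $aMatch = StringRegExp($sSource, '<p><b>(\d+) - (\d+)</b> van (\d+) resultaten', 1)
    If @error Then
        ConsoleWrite('No match' & @CRLF)
    Else
        ConsoleWrite('Range: ' & $aMatch[0] & ' - ' & $aMatch[1] & ', total: ' & $aMatch[2] & @CRLF)
    EndIf

    [/autoit]

    Putting the fixed digit "1" into the search text, as in "<p><b>1 -", only works as long as the range happens to start at 1; matching the digits with \d+ avoids that.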


    The second problem is somewhat bigger. First, here is the source in question:

    [autoit]


    <table>
    <thead>
    <tr>
    <td class="nowrap" colspan="2">
    <a href="http://www.nzbindex.nl/rss/?q=linux+ubuntu&sort=agedesc&minsize=200&complete=1&max=250&more=1" target="_blank"><img src="http://www.nzbindex.nl/template/nzbindex/images/rss.png" alt="RSS" title="RSS"> RSS</a>
    </td>
    <td align="right" class="nowrap" colspan="3">
    <p><b>1 - 7</b> van 7 resultaten <i>(0 s)</i></p>
    </td>
    </tr>
    <tr class="line">
    <td class="nowrap" colspan="2">
    <input type="submit" value="Maak NZB">
    <input type="button" value="Selecteer alles" onclick="checkAll('results'); return false;">
    <input type="button" value="Omkeren" onclick="checkInverse('results'); return false;">
    <input type="button" value="Permalink" onclick="if(!permalink('results', 'http://www.nzbindex.nl/makepermalink/')) alert('Selecteer minstens een resultaat'); return false;">
    <input type="hidden" name="n" value="linux ubuntu">
    &nbsp;&nbsp;<a href="http://www.nzbindex.nl/help/results/nederlands/" class="nowrap"><img src="http://www.nzbindex.nl/template/nzbindex/images/help_form.gif" alt="Help" title="Help"></a>
    </td>
    <td width="100" align="right">
    <h2>Grootte</h2>
    </td>
    <td width="150">
    <h2>Groep</h2>
    </td>
    <td width="50" align="right">
    <h2>Leeftijd</h2>
    </td>
    </tr>
    </thead>
    <tbody>
    <tr>
    <td class="nowrap firstcolumn" width="10" align="center">
    <input type="checkbox" id="box8165018" name="r[]" value="8165018" onclick="shiftclick(arguments[0]);">
    </td>
    <td width="100%">
    <label for="box8165018"><span class="highlight"><wbr/>Linux</span> <wbr/>X86 <span class="highlight"><wbr/>Ubuntu</span> 8<wbr/>.04 <wbr/>LTS <wbr/>Hardy <wbr/>Heron (<wbr/>Alle talen) <wbr/>-=<wbr/>- kijk ook eens op twilightnzb<wbr/>.com[00/51] <wbr/>- &quot;<wbr/>Install<wbr/>-<wbr/>Live<wbr/>-386<wbr/>-<wbr/>DVD.nzb&quot; y<wbr/>Enc</label>
    <div class="info">
    <span class="complete">41 bestanden (15673 delen)</span>
    <span class="poster">door <a href="http://www.nzbindex.nl//search/?poster=Powered+by+Newsstars.nl+(Twilightnzb)&more=1"><wbr/>Powered by <wbr/>Newsstars<wbr/>.nl (<wbr/>Twilightnzb)</a></span>
    <div class="fileinfo">
    1 PAR2 | 1 NZB | 38 ARCHIEF</div>
    <div>
    <a href="http://www.nzbindex.nl/download/8165018-1239011061/Linux-X86-Ubuntu-8.04-LTS-Hardy-Heron-Alle-talen-kijk-ook-eens-op-twilightnzb.com0051-Install-Live-386-DVD.nzb">Download</a>
    - <a href="http://www.nzbindex.nl/release/8165018/Linux-X86-Ubuntu-8.04-LTS-Hardy-Heron-Alle-talen-kijk-ook-eens-op-twilightnzb.com0051-Install-Live-386-DVD.nzb">Toon alle bestanden</a>
    </div>
    </div>
    </td>
    <td class="nowrap" align="right">
    3.74 GB</td>
    <td class="nowrap">
    a.b.nl<br/></td>
    <td class="nowrap" align="right">
    13.7 dagen</td>
    </tr>
    <tr>
    <td class="nowrap firstcolumn" width="10" align="center">
    <input type="checkbox" id="box7904393" name="r[]" value="7904393" onclick="shiftclick(arguments[0]);">
    </td>
    <td width="100%">
    <label for="box7904393"><span class="highlight"><wbr/>Linux</span> <span class="highlight"><wbr/>Ubuntu</span> <wbr/>Ultimate <wbr/>Edition 2<wbr/>.0 <wbr/>- <wbr/>Gamers (32bit) <wbr/>-=<wbr/>- kijk ook eens op twilightnzb<wbr/>.com[85/95] <wbr/>- &quot;<span class="highlight"><wbr/>LINUX</span><wbr/>_<wbr/>OS-<span class="highlight"><wbr/>Ubuntu</span><wbr/>-<wbr/>Ultimate<wbr/>_<wbr/>Edition<wbr/>_2<wbr/>.0<wbr/>-<wbr/>Gamers<wbr/>-<wbr/>_x86<wbr/>_32b<wbr/>_<wbr/>ISO.par2[1]<wbr/>.vol00<wbr/>+01<wbr/>.<wbr/>PAR2&quot; y<wbr/>Enc</label>
    <div class="info">
    <span class="complete">11 bestanden (1831 delen)</span>
    <span class="poster">door <a href="http://www.nzbindex.nl//search/?poster=Powered+by+Newsstars.nl+(Twilightnzb)&more=1"><wbr/>Powered by <wbr/>Newsstars<wbr/>.nl (<wbr/>Twilightnzb)</a></span>
    <div class="fileinfo">
    11 PAR2</div>
    <div>
    <a href="http://www.nzbindex.nl/download/7904393-1239011061/Linux-Ubuntu-Ultimate-Edition-2.0-Gamers-32bit-kijk-ook-eens-op-twilightnzb.com8595-LINUX-OS-Ubuntu-Ultimate-Edition-2.0-Gamers-x86-32b-ISO.pa.nzb">Download</a>
    - <a href="http://www.nzbindex.nl/release/7904393/Linux-Ubuntu-Ultimate-Edition-2.0-Gamers-32bit-kijk-ook-eens-op-twilightnzb.com8595-LINUX-OS-Ubuntu-Ultimate-Edition-2.0-Gamers-x86-32b-ISO.pa.nzb">Toon alle bestanden</a>
    </div>
    </div>
    </td>
    <td class="nowrap" align="right">
    447.45 MB</td>
    <td class="nowrap">
    a.b.nl<br/></td>
    <td class="nowrap" align="right">
    22.0 dagen</td>
    </tr>
    <tr>
    <td class="nowrap firstcolumn" width="10" align="center">
    <input type="checkbox" id="box7904093" name="r[]" value="7904093" onclick="shiftclick(arguments[0]);">
    </td>
    <td width="100%">
    <label for="box7904093"><span class="highlight"><wbr/>Linux</span> <span class="highlight"><wbr/>Ubuntu</span> <wbr/>Ultimate <wbr/>Edition 2<wbr/>.0 <wbr/>- <wbr/>Gamers (32bit) <wbr/>-=<wbr/>- kijk ook eens op twilightnzb<wbr/>.com[00/95] <wbr/>- &quot;<span class="highlight"><wbr/>Linux</span> <span class="highlight"><wbr/>Ubuntu</span> <wbr/>Ultimate <wbr/>Edition 2<wbr/>.0 <wbr/>- <wbr/>Gamers (32bit)<wbr/>.nzb&quot; y<wbr/>Enc</label>
    <div class="info">
    <span class="complete">83 bestanden (17087 delen)</span>
    <span class="poster">door <a href="http://www.nzbindex.nl//search/?poster=Powered+by+Newsstars.nl+(Twilightnzb)&more=1"><wbr/>Powered by <wbr/>Newsstars<wbr/>.nl (<wbr/>Twilightnzb)</a></span>
    <div class="fileinfo">
    1 NFO | 1 NZB | 81 ARCHIEF - <a href="javascript:nfo('http://www.nzbindex.nl/nfo/7904093/Linux-Ubuntu-Ultimate-Edition-2.0-Gamers-32bit-kijk-ook-eens-op-twilightnzb.com0095-Linux-Ubuntu-Ultimate-Edition-2.0-Gamers-32bit.nzb/?q=');">Toon NFO</a> </div>
    <div>
    <a href="http://www.nzbindex.nl/download/7904093-1239011061/Linux-Ubuntu-Ultimate-Edition-2.0-Gamers-32bit-kijk-ook-eens-op-twilightnzb.com0095-Linux-Ubuntu-Ultimate-Edition-2.0-Gamers-32bit.nzb">Download</a>
    - <a href="http://www.nzbindex.nl/release/7904093/Linux-Ubuntu-Ultimate-Edition-2.0-Gamers-32bit-kijk-ook-eens-op-twilightnzb.com0095-Linux-Ubuntu-Ultimate-Edition-2.0-Gamers-32bit.nzb">Toon alle bestanden</a>
    </div>
    </div>
    </td>
    <td class="nowrap" align="right">
    4.07 GB</td>
    <td class="nowrap">
    a.b.nl<br/></td>
    <td class="nowrap" align="right">
    22.0 dagen</td>
    </tr>
    <tr>
    <td class="nowrap firstcolumn" width="10" align="center">
    <input type="checkbox" id="box6640632" name="r[]" value="6640632" onclick="shiftclick(arguments[0]);">
    </td>
    <td width="100%">
    <label for="box6640632">&lt;<wbr/>UHQ&gt;&lt;<span class="highlight"><wbr/>Ubuntu</span><wbr/>.<wbr/>VTC.<wbr/>Trainings<wbr/>.<wbr/>CD&gt; [1/8] <wbr/>- &quot;<span class="highlight"><wbr/>Ubuntu</span><wbr/>.<span class="highlight"><wbr/>Linux</span><wbr/>.<wbr/>VTC.<wbr/>Training<wbr/>CD <wbr/>.rar&quot; y<wbr/>Enc</label>
    <div class="info">
    <span class="complete">1 bestand (625 delen)</span>
    <span class="poster">door <a href="http://www.nzbindex.nl//search/?poster=UHQ@USENETHQ.ORG+(UHQ)&more=1"><wbr/>U<wbr/>H<wbr/>Q@<wbr/>U<wbr/>S<wbr/>E<wbr/>N<wbr/>E<wbr/>T<wbr/>H<wbr/>Q<wbr/>.<wbr/>O<wbr/>R<wbr/>G (<wbr/>U<wbr/>H<wbr/>Q)</a></span>
    <div>
    <a href="http://www.nzbindex.nl/download/6640632-1239011061/UHQUbuntu.VTC.Trainings.CD-18-Ubuntu.Linux.VTC.TrainingCD-.rar.nzb">Download</a>
    </div>
    </div>
    </td>
    <td class="nowrap" align="right">
    234.94 MB</td>
    <td class="nowrap">
    a.b.hou<br/>a.b.mom<br/>a.b.uhq<br/></td>
    <td class="nowrap" align="right">
    53.4 dagen</td>
    </tr>
    <tr>
    <td class="nowrap firstcolumn" width="10" align="center">
    <input type="checkbox" id="box5172696" name="r[]" value="5172696" onclick="shiftclick(arguments[0]);">
    </td>
    <td width="100%">
    <label for="box5172696"><span class="highlight">ubuntu</span><wbr/>-8<wbr/>.04<wbr/>.1<wbr/>-desktop<wbr/>-i386<wbr/>.iso unbunto <span class="highlight">linux</span></label>
    <div class="info">
    <span class="complete">2 bestanden (2919 delen)</span>
    <span class="poster">door <a href="http://www.nzbindex.nl//search/?poster=Yenc@power-post.org+(Yenc-PP-A&A)&more=1"><wbr/>Yenc@power<wbr/>-post<wbr/>.org (<wbr/>Yenc<wbr/>-<wbr/>P<wbr/>P<wbr/>-<wbr/>A&amp;<wbr/>A)</a></span>
    <div class="fileinfo">
    1 PAR2</div>
    <div>
    <a href="http://www.nzbindex.nl/download/5172696-1239011061/ubuntu-8.04.1-desktop-i386.iso-unbunto-linux.nzb">Download</a>
    - <a href="http://www.nzbindex.nl/release/5172696/ubuntu-8.04.1-desktop-i386.iso-unbunto-linux.nzb">Toon alle bestanden</a>
    </div>
    </div>
    </td>
    <td class="nowrap" align="right">
    716.89 MB</td>
    <td class="nowrap">
    a.b.boneless<br/></td>
    <td class="nowrap" align="right">
    197.7 dagen</td>
    </tr>
    <tr>
    <td class="nowrap firstcolumn" width="10" align="center">
    <input type="checkbox" id="box3942814" name="r[]" value="3942814" onclick="shiftclick(arguments[0]);">
    </td>
    <td width="100%">
    <label for="box3942814">(????) [01/16] <wbr/>- &quot;<span class="highlight">ubuntu</span><wbr/>-8<wbr/>.04<wbr/>.1<wbr/>-desktop<wbr/>-amd64<wbr/>.par2 <span class="highlight">linux</span> <span class="highlight">ubuntu</span> 64bit by henk1001&quot; y<wbr/>Enc</label>
    <div class="info">
    <span class="complete">16 bestanden (2097 delen)</span>
    <span class="poster">door <a href="http://www.nzbindex.nl//search/?poster=CPP-gebruiker@domein.nl+(CPP-Gebruiker)&more=1"><wbr/>C<wbr/>P<wbr/>P<wbr/>-gebruiker@domein<wbr/>.nl (<wbr/>C<wbr/>P<wbr/>P<wbr/>-<wbr/>Gebruiker)</a></span>
    <div class="fileinfo">
    9 PAR2 | 7 ARCHIEF</div>
    <div>
    <a href="http://www.nzbindex.nl/download/3942814-1239011061/0116-ubuntu-8.04.1-desktop-amd64.par2-linux-ubuntu-64bit-by-henk1001.nzb">Download</a>
    - <a href="http://www.nzbindex.nl/release/3942814/0116-ubuntu-8.04.1-desktop-amd64.par2-linux-ubuntu-64bit-by-henk1001.nzb">Toon alle bestanden</a>
    </div>
    </div>
    </td>
    <td class="nowrap" align="right">
    785.73 MB</td>
    <td class="nowrap">
    a.b.nl<br/></td>
    <td class="nowrap" align="right">
    239.0 dagen</td>
    </tr>
    <tr>
    <td class="nowrap firstcolumn" width="10" align="center">
    <input type="checkbox" id="box3927228" name="r[]" value="3927228" onclick="shiftclick(arguments[0]);">
    </td>
    <td width="100%">
    <label for="box3927228">(????) [01/11] <wbr/>- &quot;<span class="highlight">ubuntu</span><wbr/>-8<wbr/>.04<wbr/>.1<wbr/>-desktop<wbr/>-i386<wbr/>.par2<span class="highlight">linux</span> <span class="highlight">ubuntu</span> 32bit by henk1001&quot; y<wbr/>Enc</label>
    <div class="info">
    <span class="complete">11 bestanden (2056 delen)</span>
    <span class="poster">door <a href="http://www.nzbindex.nl//search/?poster=CPP-gebruiker@domein.nl+(CPP-Gebruiker)&more=1"><wbr/>C<wbr/>P<wbr/>P<wbr/>-gebruiker@domein<wbr/>.nl (<wbr/>C<wbr/>P<wbr/>P<wbr/>-<wbr/>Gebruiker)</a></span>
    <div>
    <a href="http://www.nzbindex.nl/download/3927228-1239011061/0111-ubuntu-8.04.1-desktop-i386.par2linux-ubuntu-32bit-by-henk1001.nzb">Download</a>
    - <a href="http://www.nzbindex.nl/release/3927228/0111-ubuntu-8.04.1-desktop-i386.par2linux-ubuntu-32bit-by-henk1001.nzb">Toon alle bestanden</a>
    </div>
    </div>
    </td>
    <td class="nowrap" align="right">
    772.78 MB</td>
    <td class="nowrap">
    a.b.nl<br/></td>
    <td class="nowrap" align="right">
    239.7 dagen</td>
    </tr>
    </tbody>
    <tfoot>
    <tr class="line">
    <td class="nowrap" colspan="5">
    <input type="submit" value="Maak NZB">
    <input type="button" value="Selecteer alles" onclick="checkAll('results');">
    <input type="button" value="Omkeren" onclick="checkInverse('results');">
    <input type="button" value="Permalink">
    </td>
    </tr>
    <tr>
    <td align="center" class="nowrap" colspan="5">
    </td>
    </tr>
    </tfoot>
    </table>

    [/autoit]

    From this source I need, for each result, this markup:

    PHP
    <label for="**********">BLABLABLA</label>

    and "<span class="complete">*** bestanden (****** delen)</span>"
    and "** PAR2 | * NZB | ** ARCHIEF",
    plus the download link and the size.
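
    Three of the items listed above (label, file count, download link) can each be collected with one pattern and flag 3. This is only a sketch: the sample string is a shortened, made-up stand-in in the shape of the page source, and all variable names are illustrative.

    [autoit]

    ; Shortened sample in the shape of one result row (not the real page).
    Local $sSource = '<label for="box8165018">Ubuntu 8.04 LTS</label>' & @CRLF & _
            '<span class="complete">41 bestanden (15673 delen)</span>' & @CRLF & _
            '<a href="http://www.nzbindex.nl/download/8165018/sample.nzb">Download</a>'

    ; Flag 3 returns every capture of every match in one flat array.
    ; Labels: id and text alternate (id, text, id, text, ...).
    Local $aLabels = StringRegExp($sSource, '<label for="box(\d+)">(.*?)</label>', 3)
    ; Counts: "bestand(?:en)?" also accepts the singular "1 bestand".
    Local $aCounts = StringRegExp($sSource, '<span class="complete">(\d+) bestand(?:en)? \((\d+) delen\)</span>', 3)
    ; Links: [^"]+ takes everything up to the closing quote of href.
    Local $aLinks = StringRegExp($sSource, '<a href="([^"]+)">Download</a>', 3)

    ; UBound returns 0 for a non-array, so the loops are skipped when nothing matched.
    For $i = 0 To UBound($aLabels) - 1 Step 2
        ConsoleWrite('Label ' & $aLabels[$i] & ': ' & $aLabels[$i + 1] & @CRLF)
    Next
    For $i = 0 To UBound($aCounts) - 1 Step 2
        ConsoleWrite($aCounts[$i] & ' files, ' & $aCounts[$i + 1] & ' parts' & @CRLF)
    Next
    For $i = 0 To UBound($aLinks) - 1
        ConsoleWrite('Download: ' & $aLinks[$i] & @CRLF)
    Next

    [/autoit]

    On the real page the label text still contains <wbr/> and <span> tags, so it would need a cleanup pass afterwards.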

    I know that's a lot at once, but if you could give me an example for one
    of them and a bit more information about StringRegExp,
    so that I finally understand this function properly, I'm sure I can manage
    the rest myself...

    PS: I've already read through every thread I could find here and via Google on StringRegExp,
    but somehow I just can't get my head around it.
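
    Two things tend to cause most of the confusion with StringRegExp: the flag parameter, which decides what is returned, and greedy versus lazy quantifiers. The following small demo uses a made-up test string to show both:

    [autoit]

    Local $sText = '<b>one</b> and <b>two</b>'

    ; Flag 0 (default): only reports whether the pattern matches.
    ConsoleWrite(StringRegExp($sText, '<b>.*</b>') & @CRLF) ; 1

    ; Flag 1: capture groups of the first match only.
    ; The greedy .* runs on to the LAST </b> in the string.
    Local $aGreedy = StringRegExp($sText, '<b>(.*)</b>', 1)
    ConsoleWrite($aGreedy[0] & @CRLF) ; one</b> and <b>two

    ; Flag 3: capture groups of all matches.
    ; The lazy .*? stops at the NEAREST </b>.
    Local $aLazy = StringRegExp($sText, '<b>(.*?)</b>', 3)
    For $i = 0 To UBound($aLazy) - 1
        ConsoleWrite($aLazy[$i] & @CRLF) ; one, then two
    Next

    [/autoit]

    For repeated HTML fragments like the result rows, the combination of flag 3 and a lazy (.*?) is usually the one to start with.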


    Regards,

    nova

  • OK, problem 1 is solved; I got the "7 of 7" out with StringInStr:

    [autoit]


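    ; Intended to return the result counter from the page source, e.g. "1 - 7 von 7".
    Func _getresultnumber($source)
        ; Position of the RSS link that precedes the counter.
        Local $result = StringInStr($source, 'alt="RSS" title="RSS"> RSS</a>')
        ; Cut everything up to the counter; 175 is a hard-coded offset for this page layout.
        Local $result2 = StringTrimLeft($source, $result + 175)
        ; Position of the last "resultaten" in the remainder (occurrence -1 searches from the right).
        Local $result3 = StringInStr($result2, "resultaten", 0, -1)
        Local $result4 = StringLen($result2)
        ; Cut off the space before "resultaten" and everything after it.
        Local $result5 = StringTrimRight($result2, $result4 - $result3 + 2)
        ; Replace the closing tag and the Dutch "van" with the German "von".
        Local $result6 = StringReplace($result5, "</b> van", " von")

        Return $result6
    EndFunc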

    [/autoit]

    That works for a start :)

    But doing the rest with StringInStr and trimming would be murderous;
    for that I definitely need some kind of StringRegExp solution... I'll keep
    trying.

    Regards,

    nova
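
    As a middle ground between manual StringInStr/StringTrim arithmetic and a full regular expression, the _StringBetween function from the standard String.au3 include returns the text between two delimiters without any hard-coded offsets. A sketch under the assumption that the counter markup looks like the sample; the variable names are illustrative:

    [autoit]

    #include <String.au3>

    ; Two lines taken from the posted page source.
    Local $sSource = '<td align="right" class="nowrap" colspan="3">' & @CRLF & _
            '<p><b>1 - 7</b> van 7 resultaten <i>(0 s)</i></p>'

    ; _StringBetween returns a 0-based array with every text found between
    ; the start and end delimiter; on failure the result is not an array.
    Local $aRange = _StringBetween($sSource, '<p><b>', '</b> van')
    Local $aTotal = _StringBetween($sSource, '</b> van ', ' resultaten')

    If IsArray($aRange) And IsArray($aTotal) Then
        ConsoleWrite($aRange[0] & ' von ' & $aTotal[0] & @CRLF)
    Else
        ConsoleWrite('Counter not found' & @CRLF)
    EndIf

    [/autoit]

    Because the delimiters are searched for rather than counted, this keeps working if the markup in front of the counter changes in length.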

  • OK, I've managed everything so far. The only thing I can't get to work is:

    PHP
    </td>
    				<td class="nowrap" align="right">
    					3.74 GB				</td>
    				<td class="nowrap">
    					a.b.nl<br/>				</td>
    				<td class="nowrap" align="right">
    					13.7 dagen				</td>
    			</tr>

    reading these 3 values from the table... maybe it's because of the line breaks, I don't know.

    Does anyone have an idea?

    Regards,

    nova
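
    The line breaks and tabs are a likely culprit: by default "." does not match a line break, so a pattern like <td.*>(.*)</td> cannot reach across lines. One way around it, sketched here on a test string rebuilt from the snippet above (names are illustrative), is to let \s* absorb the whitespace on both sides of the value:

    [autoit]

    ; Test string modelled on the three cells, including line breaks and tabs.
    Local $sRow = '<td class="nowrap" align="right">' & @CRLF & @TAB & '3.74 GB' & @TAB & '</td>' & @CRLF & _
            '<td class="nowrap">' & @CRLF & @TAB & 'a.b.nl<br/>' & @TAB & '</td>' & @CRLF & _
            '<td class="nowrap" align="right">' & @CRLF & @TAB & '13.7 dagen' & @TAB & '</td>'

    ; (?s)   lets . match line breaks as well
    ; [^>]*  skips any further attributes of the td tag
    ; \s*    swallows line breaks, tabs and spaces around the value
    ; (.*?)  lazily captures the value, staying inside its own cell
    Local $aCells = StringRegExp($sRow, '(?s)<td class="nowrap"[^>]*>\s*(.*?)\s*</td>', 3)
    For $i = 0 To UBound($aCells) - 1
        ConsoleWrite('[' & $aCells[$i] & ']' & @CRLF)
    Next

    [/autoit]

    On the full page the same pattern would also match the class="nowrap" cells in the table header and footer, so it is probably best applied to the <tbody> part only.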

  • [autoit]

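    #include <Array.au3>

    ; Test string rebuilt from the table cells above, one HTML line per @CRLF.
    $str = '</td>' & @CRLF & _
    '<td class="nowrap" align="right">' & @CRLF & _
    '3.74 GB </td>' & @CRLF & _
    '<td class="nowrap">' & @CRLF & _
    'a.b.nl<br/> </td>' & @CRLF & _
    '<td class="nowrap" align="right">' & @CRLF & _
    '13.7 dagen </td>' & @CRLF & _
    '</tr>'
    ; "<td.+>" matches an opening td tag, "\r\n" the line break after it, and
    ; "(.+)" captures the following line up to "</td>".
    ; Flag 3 returns all matches as an array.
    $regexp = StringRegExp($str, "<td.+>\r\n(.+)</td>", 3)
    _ArrayDisplay($regexp)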

    [/autoit]
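
    The values captured this way still carry a trailing space and, for the group column, a <br/> tag. A possible cleanup step, assuming the three raw values from the test string above ($aRaw and $sClean are illustrative names):

    [autoit]

    ; Raw captures as the pattern above would return them for the test string.
    Local $aRaw[3] = ['3.74 GB ', 'a.b.nl<br/> ', '13.7 dagen ']

    For $i = 0 To UBound($aRaw) - 1
        ; Turn <br/> into a space so several group names stay separated.
        Local $sClean = StringRegExpReplace($aRaw[$i], '<br\s*/?>', ' ')
        ; Remove any other remaining tag.
        $sClean = StringRegExpReplace($sClean, '<[^>]+>', '')
        ; Flag 3 = strip leading (1) and trailing (2) whitespace.
        $sClean = StringStripWS($sClean, 3)
        ConsoleWrite('[' & $sClean & ']' & @CRLF)
    Next

    [/autoit]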