So at work today I needed to revisit a regular expression that I need often but can rarely get to work right. It looks like this when done correctly.
It says give me everything between a set of quotes. For instance the string [ "this is" not captured "but this is" ] returns a list that has "this is" and "but this is" in it.
I was going about it all wrong, trying to make .* work for me, but quantifiers are greedy. Many thanks to this page for showing a correct example (see the Pitfalls section).
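For anyone who wants to see the greedy problem in action, here is a minimal sketch in Python. The pattern is my own assumption of one common way to do it (a negated character class), not necessarily the exact expression from that page, and `quoted_parts` is just a made-up helper name.

```python
import re

# Assumed pattern: an opening quote, then any run of characters that are
# NOT a quote, then the closing quote. The group captures what is in between.
QUOTED = re.compile(r'"([^"]*)"')

def quoted_parts(text):
    """Return the text found between each pair of double quotes."""
    return QUOTED.findall(text)

sample = '"this is" not captured "but this is"'

# The negated class stops at the first closing quote it meets.
print(quoted_parts(sample))           # ['this is', 'but this is']

# The greedy .* runs to the LAST quote in the string instead.
print(re.findall(r'"(.*)"', sample))  # ['this is" not captured "but this is']
```

A non-greedy `.*?` would also work for this sample; the negated class just makes the intent explicit.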
In other news I had to do some reverse name lookups with a lot of IPs recently. I had a script that did it sequentially, where I would give it roughly 50,000 addresses and it would take a bit long to resolve them. Well, I had to do upwards of 6 million an hour, and 50k here and there just wasn't cutting it.
I thought about maybe forking multiple processes of 50k each, but that became error prone because each of these IPs is a unique record, so I would have needed to select 50k or so, mark them as being "in progress" and then move on. They're not in a format that has any sort of indexes on them for quick selection, so sequentially scanning that many would take a while and I would very likely be repeating many resolutions.
Anyways, I looked around Google and found a blog where a Michael Schurter had been faced with a similar (albeit smaller) problem. He had to do around 3000. I needed to do around 100,000 every couple seconds.
I used his examples and got it working for my own data. The only real difference is that I was doing reverse lookups. I must say, python-adns does exactly what it is advertised to do. My results closely mirror his. I resolved ~660,000 IPs to their DNS names in ~2 minutes and 30 seconds. Compare that to the multiple hours it _was_ taking going at 50,000 every so often.
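To put that in the same units as the 6 million an hour target, here is the back-of-the-envelope arithmetic as a tiny sketch. It uses only the rough figures quoted above (~660,000 IPs in ~150 seconds), and `per_hour` is a made-up helper name.

```python
def per_hour(count, seconds):
    """Scale a measured batch (count items in seconds) up to an hourly rate."""
    return count / seconds * 3600

# ~660,000 IPs in ~2 min 30 s (150 seconds), as quoted above.
rate = per_hour(660_000, 150)
print(rate)               # 15840000.0 lookups per hour
print(rate >= 6_000_000)  # True: comfortably above the hourly target
```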
It's so fast/parallel that I brought the DNS server down, so we resorted to using a local caching server on the host doing the resolving. That made things sane again. Well, I just wanted to plug python-adns and Mike for having a great post that really helped me out.
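If you just want to try the general idea of issuing many reverse lookups at once, here is a rough sketch. To be clear, this is not python-adns and not the code from Mike's post: it swaps in a plain thread pool from the standard library around `socket.gethostbyaddr`, and I make no claim that it reaches the same throughput. The names `reverse_lookup` and `resolve_all` and the `workers` default are made up for illustration.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def reverse_lookup(ip):
    """Return the PTR name for ip, or None if it does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None

def resolve_all(ips, lookup=reverse_lookup, workers=50):
    """Resolve many IPs concurrently and return an {ip: name_or_None} dict.

    `lookup` is injectable so the function can be exercised without a network.
    """
    ips = list(ips)  # iterated twice below, so materialise it once
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input
        return dict(zip(ips, pool.map(lookup, ips)))

if __name__ == "__main__":
    # The loopback address usually resolves locally without hitting a DNS server.
    print(resolve_all(["127.0.0.1"]))
```

Keep `workers` modest, or point the host at a local caching resolver first, unless you want to repeat my experience with the DNS server.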
- regular expression tutorial
- php preg_match tester
- base64 decoder
- adns python module C code
- developing a php extension for skype
- the daily plate
- get around firewall filtering with tsocks
- 7 habits of highly effective freelance programmers
- dive into greasemonkey - hello world
- dancing monkeys tutorial