what savory currently knows about urls

February 10th, 2010

You could go on and on and on collecting data about any particular URL. The amount of contextual data relevant to these things is near limitless, especially when they are involved in malware distribution or other hosting nastiness.

Savory includes roughly 100 different bits of information for each URL it records. More is added whenever new, reputable data sources turn up on the net. savory’s design makes it easy to extend the list of information about any particular URL, so as new sources of context are uncovered, they can be added to new or existing URLs.

So what does savory currently know?

I picked out a URL that was recently reported as having been defaced. Here’s what savory will show you when you view the general information about the URL.

Remember that almost all of this information is retrieved automatically. There is very little user interaction with savory aside from just seeing “what’s there”.

The first tab here shows the URL in question and the value of the <title> tag at the time it was submitted (or, if no title was found, a copy of the URL).

In addition to that, there is a list of notes and tags. Notes can be added by automated processes or human processes to log actions or info about the URL. Tags can be used to allow future narrowing of search results, or the building of various forms of metrics or statistics.

From here we can move on to the “meat and potatoes” of the URI context: the document.

The document contains the majority of the URL information and it can be big, so I’ll show it in pieces. First, the “document details”. This information is specific to the document itself, not to the URL which the document is associated with.

Each document has a unique ID: the Document ID. This is a UUID that can be used to pull the full document details directly out of savory. This ID is used in the context of the backend document database. Savory also uses a relational database, which includes a “savory ID”. These two IDs are different, but both are stored in both databases so that tools that use one database can also reach the other if they need to.
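To make the two-ID idea concrete, here is a minimal Python sketch. It is not savory's actual code: the two dictionaries stand in for the relational database and the document database, and every name in it (`UrlRecord`, `insert_url`, `document_for`, `row_for`) is made up for illustration.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class UrlRecord:
    """Hypothetical relational row: carries both IDs."""
    savory_id: int     # key in the relational database
    document_id: str   # UUID of the matching document


# Stand-ins for the two databases (assumptions, not savory's real storage).
relational_rows = {}   # savory_id -> UrlRecord
document_store = {}    # document_id -> dict of context fields


def insert_url(savory_id, url):
    """Store the URL in both places, each side remembering the other's ID."""
    document_id = str(uuid.uuid4())
    relational_rows[savory_id] = UrlRecord(savory_id, document_id)
    document_store[document_id] = {
        "_id": document_id,
        "savory_id": savory_id,
        "url": url,
    }
    return relational_rows[savory_id]


def document_for(savory_id):
    """A tool that starts from the relational side can reach the document."""
    return document_store[relational_rows[savory_id].document_id]


def row_for(document_id):
    """A tool that starts from the document side can reach the row."""
    return relational_rows[document_store[document_id]["savory_id"]]
```

Because each side stores the other's key, neither lookup needs a scan.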

Each time savory updates the details of a URL’s document, it creates a revision of the document. Using the drop-down menu, admins can view the context of the URL as it has changed over time and as fields have been added, removed, or changed. It should be noted that the database can be, and is, compressed periodically. The end result of this is that old revisions are lost. This is not a particularly big concern for us, though, because at this time we care less about document history than about what is current.

Moving on down the page we get to the URI context.
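A rough Python sketch of the revision behaviour described above, under the assumption that a revision is simply a full copy of the fields at that point in time. The class and method names are hypothetical; the real work is done by the backend document database.

```python
import copy


class RevisionedDocument:
    """Toy model: every update appends a new revision of the fields."""

    def __init__(self, fields):
        self.revisions = [copy.deepcopy(fields)]

    @property
    def current(self):
        # The newest revision is what the UI shows by default.
        return self.revisions[-1]

    def update(self, changes, removed=()):
        """Create a new revision with fields added, changed, or removed."""
        new_rev = copy.deepcopy(self.current)
        new_rev.update(changes)
        for name in removed:
            new_rev.pop(name, None)
        self.revisions.append(new_rev)

    def compact(self):
        """Periodic compression: keep only the current revision."""
        self.revisions = self.revisions[-1:]
```

After `compact()` the history is gone but `current` is unchanged, which is the trade-off described above.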

So here is what savory knows. This document has 72 fields of context (we’ll see more in a second).

You can see some fields that may be particularly useful to you as a human, but we also include information that is more useful to robots, scripts, or other automated tools.

Moving on, here is more context.

Among other things, savory GeoIP maps the IP address of the URL in question, so you can map the location of the URL.

Another automated process that savory runs is page sourcing. Often we’re interested to know what sort of badware is actually hosted at the URL in question. Sometimes it’s a script, other times it’s an exe or a malformed webpage; you get the idea.

We use resources that we get from page sourcing to reinforce our training and awareness programs at work. We’ve found admins to be much more interested in seeing the actual crap that is used to infect their machines than in just being told about it.

Part of the page sourcing is hashing of the content, shown in the next couple of pics.

Alright, I can hear you asking yourself “isn’t that a little excessive? Do you really need that many hashes?”

You’d be surprised how often we come across a new tool that finds some new weird way of doing things. Most of the time you’ll see MD5 or some form of SHA used by these tools but then you’ll find a tool that doesn’t. We include all the hashes we can for the URL to ensure that we don’t have to worry about the next new-fangled hash that people start using.
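The "hash it every way we can" idea is easy to sketch with Python's standard `hashlib`. This is only an illustration: the list of algorithms here is my assumption, not the set savory actually stores, and `hash_content` is a made-up name.

```python
import hashlib

# Assumed algorithm list for illustration; extend it as new tools demand.
ALGORITHMS = ("md5", "sha1", "sha256", "sha512")


def hash_content(content: bytes, algorithms=ALGORITHMS) -> dict:
    """Return {algorithm name: hex digest} for the sourced page content."""
    digests = {}
    for name in algorithms:
        digests[name] = hashlib.new(name, content).hexdigest()
    return digests
```

Adding another algorithm is a one-word change to the tuple, which is the whole point: whatever hash the next tool expects, the document already has it.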

Finally, a little bit more information after the hashes.

You get the idea.

savory includes one more stub of information on the document page. I will grant you that it is probably useless information. However, it makes for stimulating discussions among upper management during their multi-hour meetings. So we refer to the following data as “printing money”: completely irrelevant to the Techs, only cared for by the Suits.

Management like pictures of things, especially ones that look alien like the above QR code. When you take a bunch of these and put them on a single page, people begin to ask questions and your applications get more air-time among the population. QR codes are almost entirely useless in the context of this application. But we did it to scratch an itch (OK, I was just wasting time and having fun with technology, I admit it).

But it does make for an interesting conversation starter.

savory and tags

February 7th, 2010

Tags are used in a number of different contexts on the internet today. For example, social bookmarking websites, image gallery websites, blogs, etc.

Tagging provides savory users with a way to ask savory to return a list of URLs that meet a semi-specific list of requirements. The number of tags a URL can have associated with it is arbitrary. In practice, there are often a lot of tags.

savory gives you two primary views into your tag data; table and cloud.

In my own opinion, the table view is more functional. The cloud view can often look very messy due to how many tags exist in the database.

In the table view, savory lists all the tags. Each set of tags has a header. In the image above, the header is “1”. This means that all the tags in this group begin with the number “1”. Each tag is a hyperlink that will take you to the URL searching page, filtered by URLs that are tagged with the selected tag. Next to each hyperlink is a count of the number of URLs that have that particular tag applied to them.

Only ~50 tags are displayed for each block. If you want to view all the tags for a given block, there is a “show all” link at the bottom right of the block.

Clicking this link will display a modal dialog window which you can scroll through to find a listing of all the tags for that particular header.
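The table view boils down to "group tags by first character, show the first 50 or so per group". A minimal Python sketch, assuming the tag counts are already available as a plain dict; the function names and the exact limit of 50 are illustrative, not lifted from savory.

```python
from collections import defaultdict

BLOCK_LIMIT = 50  # the "~50 tags per block" mentioned above (assumed exact here)


def group_tags(tag_counts):
    """Group (tag, url_count) pairs into blocks keyed by first character."""
    blocks = defaultdict(list)
    for tag in sorted(tag_counts):
        if not tag:
            continue  # skip empty tags rather than fail on tag[0]
        blocks[tag[0].lower()].append((tag, tag_counts[tag]))
    return dict(blocks)


def visible_tags(block, show_all=False, limit=BLOCK_LIMIT):
    """What a block shows by default, versus after clicking "show all"."""
    return block if show_all else block[:limit]
```

For example, `group_tags({"1st": 3, "10net": 1, "apache": 7})` puts `10net` and `1st` under the header `1` and `apache` under `a`.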

The tag cloud view is a very busy view. It functions in nearly the same way as the table view though. The tags are clickable but there are no URL counts next to them. Instead, the size of the tag word denotes how many URLs are associated with it. The more URLs tagged with that tag, the bigger the word.
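One common way to turn URL counts into word sizes is simple linear scaling between a minimum and maximum font size. I'm not claiming this is how savory's cloud does it; the pixel bounds and the function name below are assumptions for the sake of a sketch.

```python
def cloud_sizes(tag_counts, min_px=10, max_px=40):
    """Map each tag to a font size: more URLs means a bigger word."""
    if not tag_counts:
        return {}
    lo = min(tag_counts.values())
    hi = max(tag_counts.values())
    span = hi - lo
    sizes = {}
    for tag, count in tag_counts.items():
        if span == 0:
            # Every tag is equally popular; use the smallest size for all.
            sizes[tag] = min_px
        else:
            sizes[tag] = round(min_px + (count - lo) * (max_px - min_px) / span)
    return sizes
```

With counts of 1, 6, and 11, the tags come out at 10px, 25px, and 40px respectively.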

Finally, on the tag page, you have the ability to filter tags by strings found in those tags. This is essential for searching because too many tags make your eyes glaze over. savory can filter out the cruft for you if you have a general idea of what you’re looking for.
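The filter itself is just a substring match over tag names. A tiny Python sketch, assuming a case-insensitive match (the original behaviour isn't specified) and a made-up function name:

```python
def filter_tags(tag_counts, needle):
    """Keep only the tags whose name contains the given string."""
    needle = needle.lower()
    return {
        tag: count
        for tag, count in tag_counts.items()
        if needle in tag.lower()
    }
```

The filtered dict can then be handed to either view, table or cloud.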

That wraps up savory’s view into tags and how you can use them to gauge the usage of the system. More articles to come that describe yet more features of savory that make it awesome.

savory’s url view

February 6th, 2010

savory’s primary view into the URL database is the URL module. Here’s a snapshot of what a portion of it looks like.

From this view you can move back and forth through the url list. Some pieces of information about each URL are also displayed to the end user. In the above screenshot you can see:

  • Snapshot of the webpage at the time it was recorded by savory
  • URL of the malware
  • Account which inserted the malware into savory (the bold text after “via”)
  • The time that the url was inserted
  • The tags that are associated with the URL. These include tags specified by the source doing the insert as well as auto-tagging functionality built into savory
  • A summary list of tags. The list is interpreted as all tags on the current page you’re viewing and the number of urls that have that tag.

Some other tidbits that you can see are the links next to each tag in the tag summary.

By clicking the “+” sign, you append that tag to your search criteria, narrowing the list of URLs returned by savory. By clicking on the name of the tag, you clear out your current search criteria and start with just that one tag.
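The two click behaviours can be sketched in a few lines of Python. This assumes the search criteria is a list of tags and that a URL must carry all of them to match (which is what "narrowing" implies); the function names are hypothetical.

```python
def add_tag(criteria, tag):
    """The "+" click: append the tag to the existing criteria."""
    return criteria if tag in criteria else criteria + [tag]


def only_tag(tag):
    """The tag-name click: throw away the criteria and start over."""
    return [tag]


def search(urls, criteria):
    """Return URLs that carry every tag in the criteria.

    `urls` maps each URL to the set of tags applied to it.
    """
    wanted = set(criteria)
    return [url for url, tags in urls.items() if wanted <= tags]
```

Each "+" click can only shrink the result list, since every added tag is one more requirement a URL has to meet.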

As you narrow the tag criteria, the interface changes as shown below.

Selected tags are highlighted and moved to the end of the line. You’ll also notice that the number of urls savory returned is substantially smaller (only a single URL).

Note though that the tag summary to the right has not changed much. What the tag summary is telling you is that there are 5 URLs tagged “ssdfsdf…”, 5169 URLs tagged “unknown_html”, etc.

These numbers, added together, will likely exceed the total number of URLs in savory because multiple tags can be applied to any given URL. This summary is meant to give the admin a gauge of how general or specific a given tag is in relation to the system as a whole.
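A quick Python sketch shows why the summary counts add up to more than the number of URLs. It assumes the same URL-to-tags mapping as a plain dict; `tag_summary` is an illustrative name, not a savory function.

```python
from collections import Counter


def tag_summary(urls):
    """Count, for each tag, how many URLs carry it."""
    counts = Counter()
    for tags in urls.values():
        counts.update(set(tags))  # set(): a URL counts once per tag
    return counts
```

Three URLs with two tags each yield counts that sum to six, so the total of the summary column says nothing about how many URLs exist. It only says how widely each tag is used.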

savory gets screenshots

February 5th, 2010

Today I added UI components to savory for taking or re-taking screenshots.

When savory is buzzing along doing its thing, the maintainers of the system shouldn’t be concerned with taking screenshots of the malware URLs. When things break though, or when the maintainers wish to retake screenshots, they need to have a component in the UI to do so.

As of this writing, it looks like this.

Well, that’s what you see when the screenshot has not been taken yet, or was tried and failed.

If you click the big camera in the middle of the screen, you can ask savory to re-take a snapshot of the webpage. This will open up a modal window in your browser with a progress indicator. When the snapshot process completes (and if it’s successful) you’ll be presented with your new screenshot.

You can then tell savory to save the new screenshot back to its database for future use.

Dad’s birthday

January 28th, 2010

Is this weekend and he is doing this guest chef thing at a local restaurant (god help us all).

Twinkie tools

January 14th, 2010

Since I’m a sucker for baking, I think it’s obvious that I need to buy this twinkie making tool. I’ve borrowed the two cats that lived at my mom’s house (this was her suggestion not mine) and they’ve been ok now for about 2 weeks. They’re both quite overweight because my mom feeds them too much so it was quite a shock when they came to live with me.

Now they eat only twice a day and the pounds are starting to come off. I don’t look forward to them going back home though because they’ll likely be ruined by my mom again.

Evaluating new tech

January 2nd, 2010

For a project I’m working on I was evaluating some new tech, and of course during the evaluation I was led to some more new tech that is related to this other new tech. In a nutshell, now I am in over my head :P

Christmas was not too bad this year. Surprisingly, my biggest fear (that being in the same house with my family for > 4 hours would result in the end of days) didn’t happen.

My dad was surprisingly non-judgemental and wasn’t bossy (shocking!). My mom wasn’t needy or asking for everyone to drop everything they were doing and bow to her; it was a joy.

I would have appreciated just one thing from each of them: Christmas lists. Trying to get xmas lists from either of them is like pulling teeth.

I’m quite open to buying them whatever they’d ask for (within reason). I don’t judge in this regard. If you want weird shit, ask and you shall receive; who am I to judge. But they ended up asking for just about nothing and that was a bummer.

The new year is almost here though and I imagine I’ll be at my mom’s place for it. See you next year.

Wasting time

December 21st, 2009

Not much going on; need to clean out tabs

New job; play with toys

December 20th, 2009

My friend Will stirred my curiosity in random programming tasks today, and as a result I added tweets of new malware urls to our work twitter account. So basically what happens is savory, the malware app, will tweet when it sees new urls. If you’re a really weird person, you can choose to follow these tweets.

I can’t imagine that this is useful in any way, but it was fun and I wrote a new class, Zend_Services_Bitly, in the process.

I also gained a list of all the Nessus XML-RPC endpoints. Tenable hasn’t released any official documentation on this yet. In fact, I haven’t found anything anywhere about it yet, which makes it kinda hard to use, ya know. Anyway, here they are; refer to my other post for the various required inputs.

  • https://localhost:443/login
  • https://localhost:443/logout
  • https://localhost:443/users/add
  • https://localhost:443/users/delete
  • https://localhost:443/users/chpasswd
  • https://localhost/users/list
  • https://localhost:443/plugins/description
  • https://localhost/plugins/list
  • https://localhost:443/plugins/list/family
  • https://localhost/plugins/preferences
  • https://localhost/preferences/list
  • https://localhost/policy/list
  • https://localhost:443/policy/add/
  • https://localhost:443/policy/delete/
  • https://localhost:443/policy/rename/
  • https://localhost:443/scan/new/
  • https://localhost:443/scan/stop/
  • https://localhost:443/scan/pause/
  • https://localhost:443/scan/resume/
  • https://localhost/scan/list
  • https://localhost/report/list
  • https://localhost:443/report/delete
  • https://localhost:443/file/report/download
  • https://localhost:443/report/hosts
  • https://localhost:443/report/ports
  • https://localhost:443/report/details
  • https://localhost:443/report/tags
  • https://localhost:443/file/report/import
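For what it's worth, here is a minimal Python sketch of how one of these calls could be put together, based only on what the captured traffic in the other post shows: a form-encoded POST with a `seq` field plus the call's own fields. The helper name `build_call` is made up, I don't know what `seq` actually means to the server, and the sketch only builds the request object; it never sends anything.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request

# Host and port as written in the list above; adjust for a real server.
BASE_URL = "https://localhost:443/"


def build_call(path, fields, seq, base_url=BASE_URL):
    """Build (but do not send) a form-encoded POST for one endpoint.

    `seq` appears in every captured request; its meaning is an unknown
    here, so the caller has to supply a value.
    """
    body = urlencode({"seq": seq, **fields}).encode("ascii")
    return Request(urljoin(base_url, path), data=body, method="POST")


# Example: the login call, using the field names seen in the capture.
login = build_call("login", {"login": "tim", "password": "MyPassword"}, seq=1505)
```

Actually sending it would be a matter of passing the request to `urllib.request.urlopen`, with whatever certificate handling a self-signed Nessus install needs.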

Documentation lacking

December 9th, 2009

Since I was unable to find any documentation concerning the XML-RPC endpoints for the new Nessus server, I ran it through Tamper Data and here’s what I got (not everything, but a good chunk).

URL=https://172.16.1.175:8834/login
POSTDATA =seq=1505&password=MyPassword&login=tim
URL=https://172.16.1.175:8834/plugins/list
URL=https://172.16.1.175:8834/report/list
URL=https://172.16.1.175:8834/policy/list
URL=https://172.16.1.175:8834/plugins/preferences
URL=https://172.16.1.175:8834/preferences/list
URL=https://172.16.1.175:8834/policy/add
POSTDATA =plugin%5Fselection%2Efamily%2EWindows%20%3A%20Microsoft%20Bulletins=enabled&Global%20variable%20settings%5Bfile%5D%3ASSL%20key%20to%20use%20%3A=

Basically it’s just everything you’d read from the XML file, url-encoded and sent in one gigantic POST.

URL=https://172.16.1.175:8834/scan/list
URL=https://172.16.1.175:8834/scan/new
POSTDATA =seq=2145&target=172%2E16%2E1%2E101&scan%5Fname=265&policy%5Fid=1
URL=https://172.16.1.175:8834/scan/pause
POSTDATA =seq=3587&scan%5Fuuid=3b98722d%2Df5ec%2Da565%2D7a7e%2D88335e45a5a139828ae33ca3eed1
URL=https://172.16.1.175:8834/scan/resume
POSTDATA =seq=9975&scan%5Fuuid=3b98722d%2Df5ec%2Da565%2D7a7e%2D88335e45a5a139828ae33ca3eed1
URL=https://172.16.1.175:8834/report/hosts
POSTDATA =report=3b98722d%2Df5ec%2Da565%2D7a7e%2D88335e45a5a139828ae33ca3eed1&seq=5458
URL=https://172.16.1.175:8834/report/tags
ditto
URL=https://172.16.1.175:8834/report/ports
ditto
URL=https://172.16.1.175:8834/users/list
URL=https://172.16.1.175:8834/users/edit
POSTDATA =seq=3802&admin=1&login=tim
URL=https://172.16.1.175:8834/report/hosts
POSTDATA =report=3b98722d%2Df5ec%2Da565%2D7a7e%2D88335e45a5a139828ae33ca3eed1&filter%2E0%2Efilter=plugin%5Fid&seq=7176&filter%2E0%2Evalue=10011&filter%2E0%2Equality=equal%2Dto
URL=https://172.16.1.175:8834/report/tags
POSTDATA =report=3b98722d%2Df5ec%2Da565%2D7a7e%2D88335e45a5a139828ae33ca3eed1&hostname=172%2E16%2E1%2E101&seq=5498
URL=https://172.16.1.175:8834/report/details
POSTDATA =report=3b98722d%2Df5ec%2Da565%2D7a7e%2D88335e45a5a139828ae33ca3eed1&protocol=tcp&hostname=172%2E16%2E1%2E101&seq=3380&port=445
URL=https://172.16.1.175:8834/report/delete
POSTDATA =report=3b98722d%2Df5ec%2Da565%2D7a7e%2D88335e45a5a139828ae33ca3eed1&seq=4148
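The POSTDATA lines above are much easier to read once they are url-decoded. A small Python sketch using the standard library; `decode_postdata` is a made-up helper, and it assumes each line has the `POSTDATA =<body>` shape that Tamper Data printed.

```python
from urllib.parse import parse_qsl


def decode_postdata(line):
    """Turn one captured 'POSTDATA =...' line into a dict of fields."""
    # Everything after the first "=" is the url-encoded request body.
    _, _, body = line.partition("=")
    # keep_blank_values so empty fields (like the SSL key one) survive.
    return dict(parse_qsl(body, keep_blank_values=True))


fields = decode_postdata(
    "POSTDATA =seq=2145&target=172%2E16%2E1%2E101&scan%5Fname=265&policy%5Fid=1"
)
# fields == {"seq": "2145", "target": "172.16.1.101",
#            "scan_name": "265", "policy_id": "1"}
```

Run over the filtered report/hosts capture, the same helper gives `filter.0.filter`, `filter.0.quality`, and `filter.0.value` as separate keys.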

It’s unclear whether or not you can download and upload scan results using a home feed; probably not, and thus Tenable face palms itself yet again.