andygates: (Default)
[personal profile] andygates
So it comes to pass (thick with irony) that I'm involved in the organisation's web logs and all that jazz. These logs are currently dumped out as text files from the proxy servers, three in all, so each day I get about 1.5 GB (no, really) of logs.

Currently I'm manually importing them into an MSSQL database and, for reasons of management's own, each day's logfile ends up in a separate table, ideal for difficult and tedious analysis. Clearly, I'll be automating that in just a few days' time.

But I mean, a gig and a half daily? That's half a terabyte in a year! I know we're enterprise-class, but that's a stuposterously humungous wodge of data. It's particularly unwieldy when (as has happened) I'm asked to mine it for, say, J Random User's access to see if he's been doing "anything naughty".
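
As a back-of-the-envelope check on those figures, here is a minimal sketch in Python. It assumes decimal units (1 TB = 1000 GB) and a flat 1.5 GB every day of the year; the function names are made up for illustration.

```python
# Rough check of the log volume quoted above.
# Assumptions: decimal units (1 TB = 1000 GB) and a constant daily volume.
DAILY_GB = 1.5
DAYS_PER_YEAR = 365


def yearly_volume_gb(daily_gb: float, days: int = DAYS_PER_YEAR) -> float:
    """Total volume accumulated over `days` days at `daily_gb` per day."""
    return daily_gb * days


def days_to_reach(target_gb: float, daily_gb: float) -> float:
    """Number of days needed to accumulate `target_gb` at `daily_gb` per day."""
    return target_gb / daily_gb


yearly_gb = yearly_volume_gb(DAILY_GB)
print(f"{yearly_gb:.1f} GB per year")  # 547.5 GB per year
print(f"{days_to_reach(1000, DAILY_GB):.0f} days to reach 1 TB")  # 667 days
```

So "half a terabyte in a year" holds up, and at one table per day the full terabyte arrives after roughly 667 tables.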

It strikes me that there's a trick we're missing. We need historical logs, because evidence of naughtiness is a long-term thing. But a terabyte database in a thousand tables is daft. How do real organisations handle this?
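
One possible alternative to a table per day, offered only as a sketch, is a single table with a date column and an index on the user, so that a per-user search is one query rather than a trawl through hundreds of tables. The example uses Python's built-in SQLite as a stand-in for MSSQL; the table name, column names and sample rows are all hypothetical and not taken from the real proxy log format.

```python
import sqlite3

# Hypothetical schema: the names below are illustrative, not the real log layout.


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with one log table for all days."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE proxy_log ("
        "log_date TEXT NOT NULL, username TEXT NOT NULL, url TEXT NOT NULL)"
    )
    # Index on (username, log_date) so per-user lookups avoid a full scan.
    conn.execute("CREATE INDEX idx_user_date ON proxy_log (username, log_date)")
    return conn


def load_day(conn: sqlite3.Connection, log_date: str, rows) -> None:
    """Append one day's (username, url) rows, tagging each with its date."""
    conn.executemany(
        "INSERT INTO proxy_log (log_date, username, url) VALUES (?, ?, ?)",
        [(log_date, user, url) for user, url in rows],
    )
    conn.commit()


def user_history(conn: sqlite3.Connection, username: str):
    """Return every (log_date, url) pair for one user, oldest first."""
    cur = conn.execute(
        "SELECT log_date, url FROM proxy_log "
        "WHERE username = ? ORDER BY log_date, url",
        (username,),
    )
    return cur.fetchall()


conn = make_db()
load_day(conn, "2006-07-04", [("jrandom", "http://example.com/a"),
                              ("other", "http://example.com/b")])
load_day(conn, "2006-07-05", [("jrandom", "http://example.com/c")])
history = user_history(conn, "jrandom")
print(history)
```

Whether this scales to half a terabyte a year depends on the database and hardware; the point is only that the date becomes a column value instead of part of a table name.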

Re: Mine's bigger than yours

Date: 2006-07-05 07:10 pm (UTC)
From: [identity profile] gedhrel.livejournal.com
Oh, and Sawmill is also good. (We have a lot of in-house stuff too, most of which tries to take the morass of raw stuff and produce useful things out of it.)
