Re: [SLUG] Is there an archive for the list?

From: Ed Centanni (ecentan1@tampabay.rr.com)
Date: Thu Sep 11 2003 - 19:50:57 EDT


I have a personal mail archive of the SLUG list that runs from
11/30/1999 to the present. The file is 77,130,688 bytes.
Realizing what a great knowledge base it represents, and wanting to
make use of it, I started a small project to build a searchable index
database covering all aspects of the SLUG list. I call it "emine",
short for "E-mail Data Mining".

It's a Python script that parses a standard Unix-style email mailbox
(mbox) file and builds an SQL database of string lengths and file
offsets pointing into the mail file. The database doesn't contain the
actual email; it only stores the location and size of everything in the
mailbox file. The index database is normalized and has 9 related tables
that contain information for headertypes, mimetypes, mailboxes, messages,
headers, words, attachments, word locations, and phrases of 2 to 4 words.
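
To illustrate the offset-and-length idea, here is a minimal sketch. It
is not the emine script itself: the use of sqlite3, the single
"messages" table, and the function name index_mbox are assumptions made
for the example. It records only where each message starts and how long
it is, never the message text.

```python
import sqlite3

def index_mbox(mbox_path, conn):
    """Store the byte offset and length of each message in an mbox file.

    Hypothetical schema: one table, messages(id, offset, length).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY, offset INTEGER, length INTEGER)"
    )
    starts = []
    pos = 0
    prev_blank = True  # the start of the file counts as a boundary
    with open(mbox_path, "rb") as f:
        for line in f:
            # In mbox format a message begins with a "From " line that
            # follows a blank line (or opens the file).
            if prev_blank and line.startswith(b"From "):
                starts.append(pos)
            prev_blank = line in (b"\n", b"\r\n")
            pos += len(line)
    # Each message runs up to the start of the next one (or end of file).
    ends = starts[1:] + [pos]
    rows = [(start, end - start) for start, end in zip(starts, ends)]
    conn.executemany(
        "INSERT INTO messages (offset, length) VALUES (?, ?)", rows
    )
    conn.commit()
    return rows
```

Opening the file in binary mode keeps the offsets as true byte
positions, which is what a later seek needs.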

It's a work in progress. At the moment it can populate all the tables
except words, word locations, and phrases. It shouldn't take me more
than a few evenings to finish those. It would need a user-friendly
front end to be useful as a search engine, but that shouldn't be rocket
science once the database is fully populated. The front end would get
user input, query the database, fseek to the offset(s) in the mailbox
file, and output the results.
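
As a rough sketch of the remaining pieces (again, not the actual emine
code), the following shows how offset/length pairs for words and for
phrases of 2 to 4 words could be computed, and how a front end could
seek straight to a stored span. The function names are hypothetical,
and the word pattern (runs of ASCII letters, digits, and underscores) is
an assumption. In the design described above the pairs would come from
a database query rather than being computed on the fly.

```python
import re

def word_spans(data):
    """Return (offset, length) for each word in a bytes buffer."""
    return [(m.start(), m.end() - m.start())
            for m in re.finditer(rb"\w+", data)]

def phrase_spans(words, min_words=2, max_words=4):
    """Return (offset, length) for each run of adjacent words.

    A phrase spans from the start of its first word to the end of its
    last word, so any whitespace between the words is included.
    """
    spans = []
    for i in range(len(words)):
        for n in range(min_words, max_words + 1):
            if i + n > len(words):
                break
            first_off, _ = words[i]
            last_off, last_len = words[i + n - 1]
            spans.append((first_off, last_off + last_len - first_off))
    return spans

def read_span(path, offset, length):
    """Seek to offset in the file and read length bytes."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

Because only positions are stored, the text of a hit is always read
from the mailbox file itself, so the index never duplicates the mail.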

If this seems interesting to any of you, I'm willing to put it up as an
open source project for anyone to work on and use. If you know of a
similar project that is already available, I'd like to hear about it.

Ed.

Michael Manchester wrote:

> I thought at one time there was an archive of the list
> or at least talk about having an archive of the list.
>
> Mike M.
>
> =====
<snip>



