Re: [SLUG] Web Page Storage

From: John Pugh (jpugh@novell.com)
Date: Thu May 11 2006 - 18:33:30 EDT


>>> On Thu, May 11, 2006 at 9:41 AM, in message
<200605110941.16499.s0tl155360@earthlink.net>, S0TL
<s0tl155360@earthlink.net> wrote:
> Hi All
>
> Small website retention issue.
>
> Story
> In our business we visit a lot of random websites from which we gather
> information. Currently we print the relevant information and place it
> into a file. A pen and ink system that is fast exceeding our ability to
> manage.
>
> I would like to move much of this to a computer but have no idea how.
>
> If I take a screen shot of the relevant websites I am limited in size
> to what appears on the screen, which does not even begin to cover the
> length of some of the documents.
>
> If I save the information in HTML I then have a dozen or so folders
> and subfolders for the various pictures that are included with the
> central data.
>
> In MS Windows I can print a file to a file by one of the printer
> functions. Not sure if this can be done in Linux, but the result in
> Windows is that you have a print file which, if memory serves me
> correctly, is not searchable.
>
> If I highlight, copy, and paste into OpenOffice or any other word
> processor, what I get may be usable, then again it may not be, and in
> any case for a large number of web pages, thousands, this is really
> inconvenient.
>
> So my question is: how can one save what is open on a website as, say,
> an MS Office .doc or OpenOffice file which is searchable, has the same
> information, and does not take 10 minutes or so per page to do? [I am
> assuming of course that the website is something like HTML, not
> something like Adobe Acrobat.]
>
> Thanks
> Frank
>

You can always use SUSE Linux 10.1, max out the cache size in Firefox,
and then use Beagle to index it all. A quick search then runs against
the index and away you go: searchable files stored locally.

You can also use Beagle to put the page in its search index, leave it
where it resides on the 'net, and still search it.
There are many ways of skinning the proverbial cat.
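
If Beagle isn't an option, the same "keep a local, searchable copy" idea
can be roughed out by hand. What follows is only a sketch in Python, not
how Beagle works internally; the names TextExtractor, html_to_text and
search are made up for illustration. It strips an HTML page down to its
visible text, so you end up with one plain file per page instead of
folders of pictures, and then does a crude keyword search over the
results. Fetching the pages is left out.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []       # pieces of visible text, in page order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.chunks)


def search(pages, term):
    """Return the names of saved pages whose text contains term.

    pages is a dict mapping a (hypothetical) file name to its text;
    the match is a plain case-insensitive substring test, not an index.
    """
    term = term.lower()
    return sorted(name for name, text in pages.items() if term in text.lower())
```

In practice you would write each html_to_text() result out to its own
.txt file and let grep, or an indexer like Beagle, do the searching.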

JP
-----------------------------------------------------------------------
This list is provided as an unmoderated internet service by Networked
Knowledge Systems (NKS). Views and opinions expressed in messages
posted are those of the author and do not necessarily reflect the
official policy or position of NKS or any of its employees.


