Re: [SLUG] perl [pig] duplicate removal with a twist

From: Levi Bard (taktaktaktaktaktaktaktaktaktak@gmail.com)
Date: Mon Jul 24 2006 - 16:40:00 EDT


On 7/24/06, baris nema <baris_nema@cigflorida.com> wrote:
> I'm writing a perl program that takes input from one file, processes it
> (line by line), and then puts it into another text file. I'm trying to
> figure out how I can search the second text file to see if a particular
> part of that line (at the beginning of the line) has already been put
> into the file, and if it has, not put it in.
>
> What I'm having trouble with is how to search the output file in perl
> for that text before writing to it.
> Any ideas?

Unless these files have the potential to be huge, I'd read the input
into a list (or array, whatever) and check each new line against that
list as it comes in. If the order of the lines doesn't matter, you could
keep the list sorted as you read (an insertion sort), which makes the
duplicate check much quicker.
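
Here's a rough sketch of the sorted-list idea, written in Python only to
keep it short; the same shape should carry over to Perl. It assumes the
"part at the beginning of the line" is the first whitespace-delimited
field, which is my guess and not something the original post says. The
names line_key and dedupe_sorted are made up for the example.

```python
import bisect


def line_key(line):
    # Assumption: the key is the first whitespace-delimited field.
    # Change this to match whatever "the beginning of the line" really is.
    fields = line.split(None, 1)
    return fields[0] if fields else ""


def dedupe_sorted(lines):
    """Keep the first line seen for each key, using a sorted key list."""
    seen = []  # keys seen so far, kept in sorted order
    kept = []  # lines to write out, in input order
    for line in lines:
        key = line_key(line)
        # Binary search for the slot where this key belongs.
        pos = bisect.bisect_left(seen, key)
        if pos < len(seen) and seen[pos] == key:
            continue  # key already present, so skip the duplicate
        seen.insert(pos, key)  # insert in place so the list stays sorted
        kept.append(line)
    return kept
```

The sketch keeps the surviving lines in a separate list, so the output
stays in input order even though the keys are sorted. If sorted output is
acceptable, the two lists could be merged into one.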

If neither of those is viable, you're going to be
opening/searching/closing the second file for each line of input.
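
For comparison, the open/search/close fallback might look something like
this (again Python, again assuming the key is the first
whitespace-delimited field; key_in_file and append_if_new are
hypothetical names). It rereads the whole output file for every input
line, so expect it to slow down as the output grows.

```python
def key_in_file(path, key):
    """Scan the output file and report whether any line starts with key."""
    try:
        with open(path, "r") as out:
            for line in out:
                fields = line.split(None, 1)
                if fields and fields[0] == key:
                    return True
    except FileNotFoundError:
        pass  # no output file yet, so nothing has been written
    return False


def append_if_new(path, line):
    """Append line to the output file unless its key is already there."""
    fields = line.split(None, 1)
    key = fields[0] if fields else ""
    if key_in_file(path, key):
        return False
    with open(path, "a") as out:
        out.write(line)
    return True
```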

-- 
Tcsh: Now with higher FPS!
http://www.gnu.org/philosophy/shouldbefree.html