Re: [SLUG] perl [pig] duplicate removal with a twist

From: Chris Moore (cmoore@washpat.com)
Date: Mon Jul 24 2006 - 16:38:40 EDT


You could store every string in an array and compare each new line
against all of the elements in the array. If there is no match, push the
new string onto the array, then write the lines out to a file after
you're done processing the original file. This may be slow if you're
working with a big file, since every new line is compared against
everything kept so far. You could also use the line you're working with
as a regular expression and check for a match in the file you're writing
to. These may not be the most efficient ways of doing it, but they should
get the job done. You might also want to be careful with end-of-line
characters, tabs, and so forth when doing any kind of comparison.
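
A rough sketch of the above, in case it helps. The first sub follows the
array-and-compare idea as described; the second swaps the array scan for
a hash lookup, which is a different technique but avoids rescanning
everything for each line. The sub names are made up for illustration,
and treating the first whitespace-delimited field as "the part at the
beginning of the line" is only an assumption about the data.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a trailing LF or CRLF so that line
# endings do not affect comparisons.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# Hypothetical helper: the "key" is ASSUMED to be the first
# whitespace-delimited field. Leading blanks and tabs are skipped.
# Blank lines all share the empty key, so only the first one is kept.
sub line_key {
    my ($line) = @_;
    my ($key) = strip_eol($line) =~ /^\s*(\S+)/;
    return defined $key ? $key : '';
}

# Array approach: compare each new key against every key kept so far.
# String equality (eq) is used rather than a regex; if the key were
# interpolated into a pattern it would need \Q...\E to escape
# metacharacters.
sub dedupe_scan {
    my @lines = @_;
    my (@keys, @kept);
    for my $line (@lines) {
        my $key = line_key($line);
        next if grep { $_ eq $key } @keys;   # already written once
        push @keys, $key;
        push @kept, strip_eol($line);
    }
    return @kept;
}

# Hash approach (a swapped-in alternative): one lookup per line
# instead of a scan over all previous keys.
sub dedupe_hash {
    my @lines = @_;
    my (%seen, @kept);
    for my $line (@lines) {
        next if $seen{ line_key($line) }++;  # true from the 2nd time on
        push @kept, strip_eol($line);
    }
    return @kept;
}
```

Either sub takes the list of input lines and returns the lines to keep,
in their original order, so the output file only needs to be written
once after the input has been processed.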

baris nema wrote:
> I'm writing a perl program that takes input from one file, processes
> it (line by line), and then puts it into another text file. I'm trying
> to figure out how I can search the second text file to see if a
> particular part of that line (at the beginning of the line) has
> already been put into the file, and if it has, don't put it in.
> What I'm having trouble with is how to search the output file in perl
> for that text before writing to it. Any ideas?
>
> -----------------------------------------------------------------------
> This list is provided as an unmoderated internet service by Networked
> Knowledge Systems (NKS). Views and opinions expressed in messages
> posted are those of the author and do not necessarily reflect the
> official policy or position of NKS or any of its employees.

-- 
Chris Moore
cmoore@washpat.com



