Re: [SLUG] perl [pig] duplicate removal with a twist

From: Dylan William Hardison (dylan@hardison.net)
Date: Tue Jul 25 2006 - 00:33:05 EDT


Spake baris nema on Monday, July 24, 2006 at 05:44PM -0400:
> The input files I'm processing are currently on the order of 6 MB
> (~100,000 lines), so I'm thinking an array is out. The output file is
> usually smaller.
>
> Would it be possible to search the output file without having to
> close it (maybe using a different file handle)?
>
> I'm currently doing the processing in bash, but due to the number of
> loops I want to port it over to Perl. I'm quite new at Perl, so any
> code examples would be helpful.
>
> Currently it's (Perlified pseudocode):
>
> read line in from input file;
> $criticalpart = result of several operations on line;
> # trying to replace this grep with Perl code:
> $existstat = `grep -c "$criticalpart" "$outputfile"`;
> if ( $existstat == 0 )
> {
>     $processedcriticalpart = [some more operations on $criticalpart];
>     print $outputfile "$processedcriticalpart";
> }

What does the "critical part" look like?

And yes, you can 'rewind' a file and start reading/writing from the
beginning.

See perldoc -f seek for rewinding. Note that you'd also need to open the
file for both reading *and* writing (the '+<' open mode).
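
Here's a rough, untested sketch of that approach (the file name and the
key extraction are placeholders for your real code):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Fcntl qw(SEEK_SET SEEK_END);

    my $outputfile = 'output.txt';   # placeholder name

    # '+<' opens an existing file for both reading and writing.
    open(my $out, '+<', $outputfile) or die "open $outputfile: $!";

    # Rewind and scan the whole output file for $key.
    sub seen_in_output {
        my ($fh, $key) = @_;
        seek($fh, 0, SEEK_SET) or die "seek: $!";
        while (my $line = <$fh>) {
            return 1 if index($line, $key) >= 0;   # substring test, like grep
        }
        return 0;
    }

    while (my $line = <>) {
        chomp $line;
        my $criticalpart = $line;   # stand-in for your several operations
        next if seen_in_output($out, $criticalpart);
        # Perl wants a seek between a read and a write on the same handle.
        seek($out, 0, SEEK_END) or die "seek: $!";
        print {$out} "$criticalpart\n";   # stand-in for the processed part
    }
    close $out or die "close: $!";

Be warned that this rescans the output file once per input line, so it
will crawl on 100,000 lines.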

Anyway, if the critical part is always the same size (or a decidable
size), you won't have to do this ugly rewinding thing... Give us some
more details. :-)
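
For instance, at ~6 MB of input you could probably afford to remember
every critical part in a hash (a hash, not an array) and skip the file
scan entirely. Another untested sketch, same placeholders as above:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %seen;   # critical parts we've already written

    open(my $out, '>', 'output.txt') or die "open: $!";   # placeholder name
    while (my $line = <>) {
        chomp $line;
        my $criticalpart = $line;        # stand-in for your several operations
        next if $seen{$criticalpart}++;  # true from the second sighting on
        print {$out} "$criticalpart\n";  # stand-in for the processed part
    }
    close $out or die "close: $!";

A hash lookup doesn't care how long the output file gets, only how many
distinct keys you've seen.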

-- 
"355/113 -- Not the famous irrational number PI, but an incredible simulation!"
GPG Fingerprint: E3CD FDAB 82C4 14FD 7B57  430B 770E 0EAF FB53 12C2