Another perl question

peterchan75

Supremacy Member
Joined
Apr 26, 2003
Messages
6,494
Reaction score
457
Hi perl experts,

These 2 approaches:
1. Read the entire file into an array in one go. E.g. @array = <inputfile>; Process the data from the array.

2. Read the data from the file line by line. E.g. $read_line = <inputfile>; Process the data while reading.

What I notice is that the former method is faster. Is my observation correct?
Thanks in advance.
 

davidktw

Arch-Supremacy Member
Joined
Apr 15, 2010
Messages
13,396
Reaction score
1,186
I/O is probably not the difference you are seeing, since the OS usually reads ahead into a file buffer, and somewhere along the I/O stack there will be block buffering. But if your file is too large, reading it all at once may exhaust your memory and cause swapping, which makes your first approach less than ideal.

If your 2nd approach is slower, that is most likely the overhead of performing the operation on only a small input string each time.

Depending on how you want to process your input, one way to slurp a full file into a string instead of an array is as follows
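If you want to measure the difference rather than guess, a rough sketch using the core Benchmark module could look like this. The temporary file, its 100,000 made-up lines, and the sub names are all assumptions for illustration; real results depend on your file size and on what the processing actually does.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Build a throwaway input file (hypothetical data: 100_000 short lines).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line $_\n" for 1 .. 100_000;
close $tmp_fh or die "close failed: $!";

# Approach 1: read every line into an array, then process the array.
sub read_all_then_process {
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close $fh;
    my $total = 0;
    $total += length($_) for @lines;
    return $total;
}

# Approach 2: read and process one line at a time.
sub read_line_by_line {
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $total = 0;
    while (my $line = <$fh>) {
        $total += length($line);
    }
    close $fh;
    return $total;
}

# Run each approach for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    array        => \&read_all_then_process,
    line_by_line => \&read_line_by_line,
});
```

Both subs do the same trivial work (summing line lengths), so the table mostly shows the cost of the reading style itself. With heavier per-line processing the gap may shrink or disappear.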
Code:
{ local $/; $input = <FILEHANDLER> }
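A slightly longer sketch of the same idea, assuming a lexical filehandle and a helper name (slurp_file) that I made up for this example. The temporary file is only there so the snippet runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Slurp a whole file into one string (slurp_file is a hypothetical helper).
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    # local limits the change to $/ to this do-block, so line-by-line
    # reads elsewhere in the program keep working as usual.
    my $content = do { local $/; <$fh> };
    close $fh;
    return $content;
}

# Usage: write a small sample file, then slurp it back.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\nsecond line\n";
close $tmp_fh or die "close failed: $!";

my $text = slurp_file($tmp_path);
printf "read %d characters\n", length $text;
```

Using local rather than a plain assignment to $/ matters once the program grows: a global $/ = undef silently changes every later read in the program.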
 

peterchan75

@davidktw,
Thanks.

Does it do this?

Code:
while (my $input_line = <FILEHANDLER>) {
    # ... do something with $input_line ...
}

Or rather, does your code put the entire file into one string, which I then split into an array?
@array = split(/\n/, $input);
 

davidktw

You can do it your way if you prefer processing your contents line by line. Slurping the contents basically treats them like a stream, so you need to think in a stream manner and treat carriage return and line feed just like any other characters. I can choose to process a multiline string using a regex if I like. I can also consume it line by line if the code is clearer that way. There is no definitively better way; it depends on the case.
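To make the stream-versus-lines point concrete, here is a small sketch. The sample string is made up and stands in for a slurped file; the phrase being searched for is deliberately split across a line break.

```perl
use strict;
use warnings;

# Sample text standing in for a slurped file (made up for this example).
my $input = "think in a stream\nmanner and treat line feed\r\nlike any other character\n";

# Stream view: \s+ matches across the line break, so a phrase that is
# split over two lines is still found.
my $spans_lines = ($input =~ /stream\s+manner/) ? 1 : 0;

# Line view: split on LF or CRLF, then test each line on its own.
my @lines = split /\r?\n/, $input;
my $in_one_line = grep { /stream\s+manner/ } @lines;

print "whole string matches: $spans_lines\n";
print "lines that match:     $in_one_line\n";
print "line count:           ", scalar(@lines), "\n";
```

The whole-string match succeeds while no single line matches, which is the kind of case where slurping is the more natural fit. When each line is an independent record, the line view is usually clearer.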
 

davidktw

Just a small demo: using my response in my previous post as input for the code below, I slurp in the text from STDIN and then repeatedly search for any word containing an "e", case-insensitively. Notice that I don't care about the whitespace, nor about the newline and carriage return characters.

Code:
$ cat ./slurp.pl
#!/usr/bin/env perl

use strict;
use warnings;

$/ = undef;
my $s = <>;
while ($s =~ /\b(\S*e\S*)\b/sig) {
  print $1, "\n";
}
$ ./slurp.pl
You can do it your way if you prefer


processing your contents line by line. slurping in contents basically treat the content like a stream, so you will need to
think in a stream
manner and treat carriage return and line feed just like any other characters. I can choose to process the a multiline string using regex if I like. I can also consume line by line if th code is clearer that way. There is no definite better way, it is all based on case by case basis.
prefer
processing
contents
line
line
contents
treat
the
content
like
stream
need
stream
manner
treat
carriage
return
line
feed
like
other
characters
choose
process
the
multiline
regex
like
consume
line
line
code
clearer
There
definite
better
based
case
case
 

peterchan75

@davidktw,
Thanks.
To a layman, there are many ways to do the same thing. To an expert, each way has its strengths and weaknesses. :o
 

davidktw

And so the layman's responsibility is to understand the depth of each way. No expert is born one :)
 

ipevery

Senior Member
Joined
Feb 9, 2001
Messages
703
Reaction score
2
Just to add, it also depends on the task that you are performing. If slurping into an array makes your data structure easier to read/understand/manipulate, then it is worth that slight overhead. Data structure rules as far as programming preference is concerned. Reading it line by line may be efficient, but it may make your code less readable (IMHO).


bagyidaw

Senior Member
Joined
Feb 19, 2012
Messages
2,417
Reaction score
190
Take note that slurping an entire file takes more memory than reading it line by line.
As suggested by ipevery, choose what's more suitable for the task.
There is also a module called Tie::File that represents a file's contents as a Perl array.
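A minimal sketch of Tie::File, assuming a small temporary file with three made-up lines so the snippet runs on its own. Tie::File ships with Perl, and by default each array element is one line with its line ending removed.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Create a small sample file (contents are made up for this example).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close $tmp_fh or die "close failed: $!";

# Tie the file to an array: element N is line N of the file.
tie my @lines, 'Tie::File', $path or die "Cannot tie $path: $!";

my $count  = scalar @lines;  # number of lines in the file
my $second = $lines[1];      # fetched from the file on demand
$lines[1]  = 'BETA';         # assignment rewrites that line on disk

untie @lines;                # flush changes and release the file

print "lines: $count, second line was: $second\n";
```

The file is not loaded into memory all at once, so this gives array-style access without the memory cost of a full slurp. Random access and in-place edits are convenient, but it is generally slower than a plain sequential read.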
 