[ale] Speed un-tar?

Jeff Hubbs jhubbslist at att.net
Wed Jul 30 00:33:42 EDT 2014


Do you have enough RAM to read from disk and write to a ramdisk or vice 
versa, whichever helps?
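
A rough sketch of what I mean, in case it is useful. It assumes /dev/shm
is a tmpfs mount on the VM (it usually is on Linux, but check with df)
and that one batch of archives fits in it next to your working set. The
stage_batch name and the RAMDIR variable are made up for illustration.
Keep in mind tmpfs pages count against the same 4 Gigs, so this only
helps if one bulk copy is cheaper than tar's own reads from the virtual
disk.

```shell
#!/bin/sh
# Hypothetical helper: copy one batch of archives into a tmpfs-backed
# directory, stream their contents to stdout from there, then clean up.
# RAMDIR defaults to /dev/shm, which is an assumption about the VM.
RAMDIR="${RAMDIR:-/dev/shm}"

stage_batch() {
    # Usage: stage_batch file1.tar.gz file2.tar.gz ...
    # All files in a batch are assumed to come from one directory,
    # so their base names do not collide in the staging area.
    stage=$(mktemp -d "$RAMDIR/untar.XXXXXX") || return 1

    # One bulk copy from disk into RAM.
    cp -- "$@" "$stage"/ || { rm -rf "$stage"; return 1; }

    # Extract in the order the files were given, reading from RAM.
    for f in "$@"; do
        # Same command as in the quoted loop, pointed at the staged copy.
        tar xzOf "$stage/$(basename -- "$f")"
    done

    rm -rf "$stage"
}
```

You would call it once per group of 10 and pipe the output into
whatever fills the array, e.g. stage_batch dir/*.tar.gz | your_parser
(your_parser being a stand-in for your own code).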

On 7/29/14, 6:44 PM, Jim Kinney wrote:
> Ugh. Sounds like you'll need to do it in stages: a coarse-grained search
> written to new files, then a fine-grained search on those new files.
> On Jul 29, 2014 6:08 PM, "Robert L. Harris" <robert.l.harris at gmail.com>
> wrote:
>
>> Unfortunately I can't touch the VM's configuration or the hardware
>> underneath it.  Supposedly I'm spread across a minimum of 6 "fast" disks
>> already.  I can't really go less than 10 files though as I am concerned
>> with information being spread across multiple files.  I was hoping someone
>> knew a tool/util I had not found yet which would rip through the data
>> faster.
>>
>> Robert
>>
>>
>>
>> On Tue, Jul 29, 2014 at 4:00 PM, Jim Kinney <jim.kinney at gmail.com> wrote:
>>
>>> Unless you can spread that read/write load out over many, many spindles,
>>> you're stuck. Now add in that the VM must go through the virtual drive
>>> process and you've got another performance hit.
>>>
>>> You _could_ add extra drives to the VM that are hosted on a decent array
>>> (fiber channel or LA network iSCSI), copy the files to the new home in a
>>> batch and hit the 4G RAM limit.
>>>
>>> If possible, can you add more RAM to that VM?
>>>
>>>
>>> On Tue, Jul 29, 2014 at 5:10 PM, Robert L. Harris
>>> <robert.l.harris at gmail.com> wrote:
>>>> I'm working on a tool to parse through a lot of data for processing.
>>>> Right now it's taking longer than I wish it would so I'm trying to find
>>>> ways to improve the performance.  Right now it appears the biggest
>>>> bottleneck is IO.  I'm looking at about 2000 directories which contain
>>>> between 1 and 200 files in tar.gz format on a VM with 4 Gigs of RAM.  I
>>>> need to load the data into an array to do some pre-processing cleanup so
>>>> I am currently chopping the files in each of the directories into an
>>>> array of groups of 10 files at a time ( seems to be the sweet spot to
>>>> prevent swap ) and then a straight forward loop of which each iteration
>>>> executes:
>>>>
>>>>    tar xzOf $Loop |
>>>>
>>>> and then pushes it into my array for processing.
>>>>
>>>> I have tried:
>>>>
>>>>   gzcat $Loop | tar xO |
>>>>
>>>> which is actually slower.  Yes, I'm at the point of trying to squeeze
>>>> seconds of time out of a group.  Any thoughts of a method which might be
>>>> quicker?
>>>>
>>>> Robert
>>>>
>>>> --
>>>> :wq!
>>>>
>>>> ---------------------------------------------------------------------------
>>>> Robert L. Harris
>>>>
>>>> DISCLAIMER:
>>>>        These are MY OPINIONS             With Dreams To Be A King,
>>>>         ALONE.  I speak for                      First One Should Be A Man
>>>>         no-one else.                                     - Manowar
>>>> _______________________________________________
>>>> Ale mailing list
>>>> Ale at ale.org
>>>> http://mail.ale.org/mailman/listinfo/ale
>>>> See JOBS, ANNOUNCE and SCHOOLS lists at
>>>> http://mail.ale.org/mailman/listinfo
>>>>
>>>
>>>
>>> --
>>> --
>>> James P. Kinney III
>>>
>>> Every time you stop a school, you will have to build a jail. What you
>>> gain at one end you lose at the other. It's like feeding a dog on his own
>>> tail. It won't fatten the dog.
>>> - Speech 11/23/1900 Mark Twain
>>>
>>>
>>> http://heretothereideas.blogspot.com/
>


