[ale] best way to copy 3Tb of data

Jim Kinney jkinney at jimkinney.us
Tue Oct 27 11:38:36 EDT 2015


I implemented a cron job to delete scratch data created over 30 days ago. That didn't go over well with the people who were eating up all the space and not paying for hard drives. So I gave them a way to extend particular areas up to 90 days; on day 91 it was deleted. So they wrote a script to copy their internet archive around every 2 weeks to keep the creation date below the 30-day cutoff. So I shrank the /scratch partition to about 10G larger than was currently in use. He couldn't do the runs he needed to graduate on time without cleaning up his mess. It also angered other people, and they yelled at him when I gave my report on who the storage hog was.
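The age-based cleanup described above can be sketched roughly like this (the /scratch path and the 30-day threshold here are illustrative, not the actual policy):

```shell
# Illustrative sketch of an age-based scratch cleanup, assuming a /scratch
# mount and a 30-day policy. -mtime +30 matches files last modified more
# than 30 full days ago; -delete removes each match after printing it.
find /scratch -type f -mtime +30 -print -delete
```

As the anecdote shows, anything keyed off file timestamps can be gamed by re-copying the data, which resets the modification time.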

On October 27, 2015 11:24:48 AM EDT, Todor Fassl <fassl.tod at gmail.com> wrote:
>I dunno.  First of all, I don't have any details on what's going on on 
>the HPC cluster. All I know is the researcher says he needs to back up 
>his  3T of scratch data because they are telling him it will be erased 
>when they upgrade something or other. Also, I don't know how you can 
>have 3T of scratch data or why, if it's scratch data, it can't just be 
>deleted. I come across this all the time though. Researchers pretty 
>regularly generate 1T+ of what they insist is scratch data.
>
>In fact, I've had this discussion with this very same researcher. He's 
>not the only one who does this but he happens to be the guy who i last 
>questioned about it. You know this "scratch" space isn't backed up or 
>anything. If the NAS burns up or if you type in the wrong rm command, 
>it's gone. No problem, it's just scratch data. Well, then how come I 
>can't just delete it when I want to re-do the network storage device?
>
>They get mad if you push them too hard.
>
>
>
>
>
>On 10/27/2015 09:45 AM, Jim Kinney wrote:
>> Dumb question: Why is data _stored_ on an HPC cluster? The storage for
>> an HPC should be a separate entity entirely. It's a High Performance
>> cluster, not a Large Storage cluster. Ideally, a complete teardown and
>> rebuild of an HPC should have exactly zero impact on the HPC users'
>> data. Any data kept on the local space of an HPC is purely scratch/temp
>> data and is disposable, with the possible exception of checkpoint data,
>> and that should be written back to the main storage and deleted once the
>> full run is completed.
>>
>> On Tue, 2015-10-27 at 08:33 -0500, Todor Fassl wrote:
>>> One of the researchers I support wants to back up 3T of data to his space
>>> on our NAS. The data is on an HPC cluster on another network. It's not
>>> an on-going backup. He just needs to save it to our NAS while the HPC
>>> cluster is rebuilt. Then he'll need to copy it right back.
>>>
>>> There is a very stable 1G connection between the 2 networks. We have
>>> plenty of space on our NAS. What is the best way to do the copy?
>>> Ideally, it seems we'd want both the ability to restart the copy
>>> if it fails partway through and to end up with a compressed archive
>>> like a tarball. Googling around tends to suggest that it's either rsync
>>> or tar. But with rsync, you wouldn't end up with a tarball. And with
>>> tar, you can't restart it in the middle. Any other ideas?
>>> Since the network connection is very stable, I am thinking of suggesting
>>> tar.
>>>
>>> tar zcvf - /datadirectory | ssh user@backup.server "cat > backupfile.tgz"
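Since a 3T stream gives no easy way to spot corruption, one variation on the pipe above (host and paths are placeholders) is to checksum the stream as it goes out:

```shell
# Stream the archive to the remote host while also computing a local
# checksum of the exact bytes sent; user@backup.server and the paths are
# examples. The >(...) process substitution requires bash.
tar zcf - /datadirectory \
  | tee >(sha256sum > datadirectory.sha256) \
  | ssh user@backup.server "cat > backupfile.tgz"
# Afterwards, run sha256sum backupfile.tgz on the remote side and compare
# the result against datadirectory.sha256.
```

That way a 3T transfer can be verified without reading the source tree a second time.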
>>>
>>> If the researcher would prefer his data to be copied to our NAS as
>>> regular files, just use rsync with compression. We don't have an rsync
>>> server that is accessible to the outside world. He could use ssh with
>>> rsync but I could set up rsync if it would be worthwhile.
>>>
>>> Ideas? Suggestions?
>>>
>>>
>>>
>>> on at the far end.
>>>
>>> He is going to need to copy the data back in a few weeks. It might even
>>> be worthwhile to send it via tar without uncompressing/unarchiving it on
>>> the receiving end.
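If the data does sit on the NAS as a tarball, the return trip can reuse the same pipe in reverse (host and paths are again placeholders):

```shell
# Pull the stored tarball back from the NAS host and unpack it directly
# on the cluster side; user@backup.server and the paths are examples.
# tar -C requires the target directory to already exist.
ssh user@backup.server "cat backupfile.tgz" | tar zxf - -C /datadirectory
```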
>>>
>>>
>>>
>>> _______________________________________________
>>> Ale mailing list
>>> Ale at ale.org <mailto:Ale at ale.org>
>>> http://mail.ale.org/mailman/listinfo/ale
>>> See JOBS, ANNOUNCE and SCHOOLS lists at
>>> http://mail.ale.org/mailman/listinfo
>>
>> --
>> James P. Kinney III
>>
>> Every time you stop a school, you will have to build a jail. What you
>> gain at one end you lose at the other. It's like feeding a dog on his
>> own tail. It won't fatten the dog.
>> - Speech 11/23/1900 Mark Twain
>>
>> http://heretothereideas.blogspot.com/
>>
>>
>>
>>
>
>-- 
>Todd

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

