[ale] 117000 files vs 240 missing - amazon

David Ritchie deritchie at gmail.com
Mon Nov 25 00:51:20 EST 2013


I suggest having the vendor gzip the file structure (or pack it in a
mutually agreeable archive format) and send it as one file...
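
If it helps, here is a minimal sketch of the one-file idea in Python, assuming a gzip-compressed tar is the agreed format. The function name archive_tree and both paths are made up for illustration. It packs a tree and reports how many regular files ended up in the archive, so both sides can compare a single number.

```python
import tarfile
from pathlib import Path


def archive_tree(src_dir: str, archive_path: str) -> int:
    """Pack src_dir into a .tar.gz and return the number of regular files stored.

    archive_path should live outside src_dir so the archive does not
    try to include itself.
    """
    src = Path(src_dir)
    # Write the whole tree, rooted at the directory's own name.
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(str(src), arcname=src.name)
    # Re-open the archive and count only regular files (not directories or links).
    with tarfile.open(archive_path, "r:gz") as tar:
        return sum(1 for member in tar.getmembers() if member.isfile())
```

Calling archive_tree("vendor_files", "/tmp/vendor_files.tar.gz") on the sending side and counting the members again after transfer would show right away whether anything went missing in transit.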


On Fri, Nov 22, 2013 at 7:50 AM, Lightner, Jeff <JLightner at water.com> wrote:

>  Long directory structures were involved.  In fact, on our initial attempt we
> found that it didn’t download everything because wget by default only
> recurses 5 levels deep, so we restarted with 99 levels, the max
> it would allow.  I don’t think we had any that actually hit 99 levels, but
> we probably ought to verify that.
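
One way to verify that on the downloaded copy is to measure the deepest directory level. This is only a rough sketch: max_depth is a name I made up, and directory depth on disk is just a proxy for wget's recursion level, so treat a result near the limit as a hint to look closer rather than proof.

```python
import os


def max_depth(root: str) -> int:
    """Return the largest number of directory levels found below root.

    root itself counts as level 0, root/a as 1, root/a/b as 2, and so on.
    """
    root = os.path.abspath(root)
    base = root.rstrip(os.sep).count(os.sep)
    deepest = 0
    for dirpath, _dirnames, _filenames in os.walk(root):
        # Depth is the number of path separators added beyond the root.
        depth = dirpath.count(os.sep) - base
        deepest = max(deepest, depth)
    return deepest
```

If the number comes back far below the limit used for the download, depth truncation is unlikely to explain the missing files.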
>
>
>
> The find was a straightforward find with no flags initially.  Later a
> find with -type f was done, then another, more complicated one just to
> show directories.  Adding those together gave the same total as the
> initial find and the wget summary.
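
The same breakdown can be sketched in Python, in case it is useful to hand the vendor a files-only number next to the combined one. count_entries is a hypothetical helper; it counts the starting directory as a directory, the way a plain find does, and puts symlinks and anything unusual under "other".

```python
import os


def count_entries(root: str) -> dict:
    """Count regular files, directories, and other entries under root."""
    # dirs starts at 1 because a plain `find root` also prints root itself.
    counts = {"files": 0, "dirs": 1, "other": 0}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                counts["other"] += 1      # symlinks are neither -type f nor -type d
            elif os.path.isdir(path):
                counts["dirs"] += 1
            elif os.path.isfile(path):
                counts["files"] += 1
            else:
                counts["other"] += 1      # sockets, devices, and the like
    counts["total"] = counts["files"] + counts["dirs"] + counts["other"]
    return counts
```

If the vendor's figure lands between "files" and "total", the difference may simply be which entry types each side counted.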
>
>
>
> We tried NLIST (LIST not available) but it doesn’t do recursion at the
> remote site.
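
Since the name listing does not recurse by itself, the recursion can be done on the client side. Below is a rough sketch using Python's ftplib that relies only on NLST and CWD: every name is treated as a directory if the server lets us change into it, and as a file otherwise. count_remote is a made-up name, and the approach assumes the server returns bare names for a listing of the current directory.

```python
from ftplib import error_perm


def count_remote(ftp) -> tuple:
    """Return (files, dirs) found below the current remote directory.

    ftp is a logged-in ftplib.FTP object (or anything offering nlst/cwd).
    """
    files = dirs = 0
    try:
        names = ftp.nlst()
    except error_perm:
        names = []  # some servers answer 550 for an empty directory
    for name in names:
        if name in (".", ".."):
            continue
        try:
            ftp.cwd(name)          # succeeds only for directories
        except error_perm:
            files += 1
            continue
        dirs += 1
        sub_files, sub_dirs = count_remote(ftp)
        files += sub_files
        dirs += sub_dirs
        ftp.cwd("..")              # step back up before the next name
    return files, dirs
```

To use it, create an ftplib.FTP connection with your own host and credentials, log in, cwd to the top of the vendor's tree, and call count_remote(ftp). With 117,000 entries it will be slow, since it costs one round trip per name. A directory we lack permission to enter would be miscounted as a file, and a symlinked directory could confuse the step back up.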
>
>
>
> *From:* ale-bounces at ale.org [mailto:ale-bounces at ale.org] *On Behalf Of *David
> Tomaschik
> *Sent:* Thursday, November 21, 2013 8:43 PM
> *To:* Atlanta Linux Enthusiasts
> *Subject:* Re: [ale] 117000 files vs 240 missing - amazon
>
>
>
> Is it all in one directory, or was there directory structure transferred?
>  What were the predicates to your find command?  (Thinking their count
> might've included directories or something.)
>
>
>
> On Thu, Nov 21, 2013 at 1:59 PM, Lightner, Jeff <JLightner at water.com>
> wrote:
>
> A vendor put a site on Amazon with some files we need.   We don’t have
> sftp access to this Amazon site but do have ftp access.
>
>
>
> Accordingly we did a wget to download all the files using our ftp
> credentials.    When all done we got over 117,000 files and saw no errors
> in the wget.
>
>
>
> The problem is that the vendor is telling our director there are 240 more
> files in their count than we downloaded.  This is roughly a 0.2%
> difference, so I suspect it has something to do with the way they count
> vs. the way we did.  (We used find piped to wc -l.)  Our count matches the
> summary wget printed when it finished, so we are sure we’re correctly
> counting what wget did, but of course it’s possible wget actually missed
> something, though it seems unlikely to me.
>
>
>
> The question is: does anyone know what might cause such a difference?
> Alternatively, does anyone know another way we could count the files on the
> Amazon site using our ftp credentials, other than going in and counting them
> one by one?
>
>
>
> We’re trying to find out how the vendor did their count but I was hoping
> someone already knows of some vagary on Amazon sites that would cause this
> kind of discrepancy.
>
>
>
> _______________________________________________
> Ale mailing list
> Ale at ale.org
> http://mail.ale.org/mailman/listinfo/ale
> See JOBS, ANNOUNCE and SCHOOLS lists at
> http://mail.ale.org/mailman/listinfo
>
>
>
>
>
> --
> David Tomaschik
> OpenPGP: 0x5DEA789B
> http://systemoverlord.com
> david at systemoverlord.com
>
>
>