[ale] db insert driving me nuts

Jim Kinney jim.kinney at gmail.com
Thu Nov 5 17:37:42 EST 2009


I like a modified 2 pass approach. Insert all U's and park !U
elsewhere. Now update original with summing for counters and overwrite
for type all keyed on ID.

I've been trying to do it the other way around and a simple update is
far easier.

temp tables are my friend :-)

On Thu, Nov 5, 2009 at 5:12 PM, Tim Watts <timtw at earthlink.net> wrote:
> Presumably an ID's type could be described by the set ([ABCDE],X) but not
> (A,B,C,D,E,X), right?
>
> What about a 2 pass approach where on pass (1) you insert all the non-U's and
> write the U's to a temp file; then on pass (2) you insert/update the rows from
> the temp U file?
>
> I guess you could also "(cat f1; cat f2) | sort" but I don't remember if sort
> has an upper limit on input size.
>
> Neither of these scream "zippy" but may work if I'm understanding the problem
> correctly.
>
>
> On Thursday 05 November 2009 3:14:21 pm Jim Kinney wrote:
>> I'm working on a postgresql db and have a data file I need to
>> insert/mung and insert and I'm stumped on the following:
>>
>> data is  as below
>>
>> ID   counter_1   counter_2  counter_total   type
>> 1       100              20            120             A
>> 1         15                3              18             U
>> 2         23               22            45              C
>> 2        125              55             180           U
>> 3        15              120            135            A
>> 4        75               25              100           B
>> ...
>>
>> I need to sum the counter per ID. That's easy. But I also need to
>> adjust the type U to be A,B,C,D or E if the ID ALSO has a type U but
>> NOT if the ID has ONLY a type U
>>
>> So the final data is compacted to be:
>>
>> ID   counter_1   counter_2  counter_total   type
>> 1       115              23            138             A
>> 2       148             77              225             C
>> 3        15              120            135            A
>> 4        75               25              100           B
>>
>>
>> The best part is the source of these data sets is two separate files
>>
>> One file has ID, counter_1, type
>> the other has ID, counter_2, type
>>
>> neither file has counter_total as it's a sum of counter_1 and counter_2
>>
>> each file is millions of lines long.....
>>
>> --
>
> --
> Before you embark on a journey of revenge, dig two graves.
>  -- Confucius
>
>
> _______________________________________________
> Ale mailing list
> Ale at ale.org
> http://mail.ale.org/mailman/listinfo/ale
> See JOBS, ANNOUNCE and SCHOOLS lists at
> http://mail.ale.org/mailman/listinfo
>
>



-- 
-- 
James P. Kinney III
Actively in pursuit of Life, Liberty and Happiness



More information about the Ale mailing list