|
IDS Forum
Re: Deleting duplicate rows from a table.
Posted By: Syed Ahmad Najmi Date: Tuesday, 25 January 2005, at 8:29 p.m.
In Response To: Re: Deleting duplicate rows from a table. (pamadeo@cespi.unlp.edu.ar)
--------------Boundary-00=_06HWQL80000000000000 Content-Type: Multipart/Alternative; boundary="------------Boundary-00=_06HWLVC0000000000000"
--------------Boundary-00=_06HWLVC0000000000000 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
Hi =0D =0D I faced the similar problem before.=0D What i did was :-=0D a. create a new table with similar table structure=0D b.do select query from the table which u got the duplication=0D =0D example:-=0D =0D unload to /tmp/filename.unl=0D select field_1, field_2 , field_3 . . . . .=0D from table_name=0D group by field_1, field_2 , field_3 . . . . .=0D =0D * Note : assuming the duplication happen on all the columns. The unload w= ill only store non duplicate records.=0D =0D c. load the unload file into a new table=0D =0D Hope this will work.=0D =0D Bye=0D =0D =0D =0D =0D =0D =0D =0D Syed Ahmad Najmi Syed Md Nasir=0D Senior System Analyst=0D Century Software Malaysia Sdn Bhd=0D 57-5 Block G, Jalan PJU 1/37=0D Dataran Prima=0D 47301 Petaling Jaya, Selangor=0D Ph: (60 3) 7804 4464=0D Fax: (60 3) 7804 4494=0D http://www.CenturySoftware.com.au=0D =0D This E-mail from Century Software Pty Ltd expresses the views of the send= er and not necessarily the views of the Company. The E-mail and any=0D files transmitted with it are confidential to the intended recipient at t= he E-mail address to which it has been addressed. The E-mail may not be=0D disclosed or used by any other than the addressee, nor may it be copied i= n any way. If you are not the intended recipient please contact the sender = as soon as possible and delete any copies of this message. Please note that although this E-mail has been checked, we cannot accept any responsibilit= y for any transmitted viruses. It is therefore your responsibility to virus scan attachments (if any).=0D =0D =0D -------Original Message-------=0D =0D From: pamadeo@cespi.unlp.edu.ar=0D Date: 01/25/05 22:45:05=0D To: ids@iiug.org=0D Subject: Re: Deleting duplicate rows from a table. [4099]=0D =0D Hi, we use rowid column for did something like it.=0D You can create another table with the same structure of original table an= d=0D create a primary key or an unique index over that table. After that scan=0D original table and insert records in the new table, with an exception blo= ck for=0D cougth it.=0D =0D Paola=0D =0D Mensaje citado por manoj wadhwa <itm_manoj@yahoo.com>:=0D =0D > Hi,=0D >=0D > I have got a big table containing about 350 million=0D > records. There are some duplicate records in the table=0D > ( around 80,000). Please suggest some way to delete=0D > the duplicate records keeping one copy of them.=0D >=0D > TIA,=0D > Manoj=0D >=0D > Background info : We loaded the data using HPL.=0D > Because of space crunch, the HPL stopped inbetween.=0D > After adding extra chunks, when we restarted it, it=0D > loaded the same unl files again which are causing the=0D > trouble.=0D >=0D >=0D >=0D >=0D >=0D > __________________________________=0D > Do you Yahoo!?=0D > Yahoo! Mail - 250MB free storage. Do more. Manage less.=0D > http://info.mail.yahoo.com/mail_250=0D >=0D >=0D >=0D =0D =0D =0D =2E --------------Boundary-00=_06HWLVC0000000000000 Content-Type: Text/HTML; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
<HTML><HEAD> <META http-equiv=3DContent-Type content=3D"text/html; charset=3Diso-8859-= 1"> <META content=3D"IncrediMail 1.0" name=3DGENERATOR></HEAD> <BODY style=3D"BACKGROUND-POSITION: 0px 0px; FONT-SIZE: 12pt; MARGIN: 5px= 10px 10px; FONT-FAMILY: Arial" bgColor=3D#ffffff background=3D"" scroll=3D= yes ORGYPOS=3D"0"> <TABLE id=3DINCREDIMAINTABLE cellSpacing=3D0 cellPadding=3D2 width=3D"100= %" border=3D0> <TBODY> <TR> <TD id=3DINCREDITEXTREGION style=3D"FONT-SIZE: 12pt; CURSOR: auto; FONT-F= AMILY: Arial" width=3D"100%"> <DIV><FONT face=3D"Comic Sans MS" size=3D2>Hi </FONT></DIV> <DIV> </DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>I faced the similar= problem before.</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>What i did was :-</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>a. create a new table wit= h similar table structure</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>b.do select query from the tab= le which u got the duplication</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2></FONT> </DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>example:-</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2></FONT> </DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>unload to /tmp/filename.= unl</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>select field_1, field_2 , fiel= d_3 . . . . .</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>from table_name</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>group by field_1, field_2 , fi= eld_3 . . . . .</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2></FONT> </DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>* Note : assuming the duplicat= ion happen on all the columns. The unload will only store non duplicate r= ecords.</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2></FONT> </DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>c. load the unload file i= nto a new table</FONT></DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2></FONT> </DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>Hope this will work.</FONT></D= IV> <DIV><FONT face=3D"Comic Sans MS" size=3D2></FONT> </DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2>Bye</FONT></DIV> <DIV> </DIV> <DIV> </DIV> <DIV><FONT face=3D"Comic Sans MS" size=3D2></FONT> </DIV> <DIV> </DIV> <DIV> </DIV> <DIV> </DIV> <DIV><IMG id=3DINCREDI_SIGIMG src=3D"cid:58C56123-90FC-41F0-8CC7-16ACAC35= AB34"></DIV> <DIV><SPAN style=3D"FONT-SIZE: 10pt"> <DIV> <P><SPAN style=3D"FONT-SIZE: 10pt"><FONT size=3D1><STRONG><FONT style=3D"= BACKGROUND-COLOR: #ffff00" color=3D#fe0000>Syed</FONT> <FONT style=3D"BAC= KGROUND-COLOR: #ffff00" color=3D#fe0000>Ahmad</FONT> <FONT style=3D"BACKG= ROUND-COLOR: #ffff00" color=3D#fe0000>Najmi</FONT> <FONT style=3D"BACKGRO= UND-COLOR: #ffff00" color=3D#fe0000>Syed</FONT> <FONT style=3D"BACKGROUND= -COLOR: #ffff00" color=3D#fe0000>Md</FONT> <FONT style=3D"BACKGROUND-COLO= R: #ffff00" color=3D#fe0000>Nasir</FONT><BR></STRONG>Senior System Analys= t<BR>Century Software Malaysia <FONT style=3D"BACKGROUND-COLOR: #ffff00" = color=3D#fe0000>Sdn</FONT> <FONT style=3D"BACKGROUND-COLOR: #ffff00" colo= r=3D#fe0000>Bhd</FONT><BR>57-5 Block G, <FONT style=3D"BACKGROUND-COLOR: = #ffff00" color=3D#fe0000>Jalan</FONT> PJU 1/37<BR><FONT style=3D"BACKGROU= ND-COLOR: #ffff00" color=3D#fe0000>Dataran</FONT> Prima<BR>47301 <FONT st= yle=3D"BACKGROUND-COLOR: #ffff00" color=3D#fe0000>Petaling</FONT> <FONT s= tyle=3D"BACKGROUND-COLOR: #ffff00" color=3D#fe0000>Jaya</FONT>, <FONT sty= le=3D"BACKGROUND-COLOR: #ffff00" color=3D#fe0000>Selangor</FONT><BR><FONT= style=3D"BACKGROUND-COLOR: #ffff00" color=3D#fe0000>Ph</FONT>: (60 3) 78= 04 4464<BR>Fax: (60 3) 7804 4494<BR></FONT><A title=3D"http://<FONT" href= =3D"http://www.centurysoftware.com.au/" target=3D_blank color=3D"#fe0000"= ><STRONG><FONT size=3D1>http://<FONT style=3D"BACKGROUND-COLOR: #ffff00" = color=3D#fe0000>www</FONT>.<FONT style=3D"BACKGROUND-COLOR: #ffff00" colo= r=3D#fe0000>CenturySoftware</FONT>.<FONT style=3D"BACKGROUND-COLOR: #ffff= 00" color=3D#fe0000>com</FONT>.au</FONT></STRONG></A><BR></SPAN></P> <P><FONT size=3D1><SPAN style=3D"FONT-SIZE: 10pt"><FONT size=3D1>This E-m= ail from Century Software <FONT style=3D"BACKGROUND-COLOR: #ffff00" color= =3D#fe0000>Pty</FONT> Ltd expresses the views of the sender and not neces= sarily the views of the Company. The E-mail and any<BR>files transmitted = with it are confidential to the intended recipient at the E-mail address = to which it has been addressed. The E-mail may not be<BR>disclosed or use= d by any other than the addressee, nor may it be copied in any way. If yo= u are not the intended recipient please contact the sender as soon as pos= sible and delete any copies of this message. Please note that although th= is E-mail has been checked, we cannot accept any responsibility for any t= ransmitted viruses. It is therefore your responsibility to virus scan att= achments (if any)</FONT>.<BR style=3D"mso-special-character: line-break">= <BR style=3D"mso-special-character: line-break"><?xml:namespace prefix =3D= o ns =3D "urn:schemas-microsoft-com:office:office" /><o:p></o:p></SPAN><= /FONT></P></DIV></SPAN></DIV> <DIV id=3DIncrediOriginalMessage><I>-------Original Message-------</I></D= IV> <DIV> </DIV> <DIV id=3Dreceivestrings> <DIV dir=3Dltr style=3D"FONT-SIZE: 11pt"><I><B>From:</B></I> <A href=3D"m= ailto:pamadeo@cespi.unlp.edu.ar">pamadeo@cespi.unlp.edu.ar</A></DIV> <DIV dir=3Dltr style=3D"FONT-SIZE: 11pt"><I><B>Date:</B></I> 01/25/05 22:= 45:05</DIV> <DIV dir=3Dltr style=3D"FONT-SIZE: 11pt"><I><B>To:</B></I> <A href=3D"mai= lto:ids@iiug.org">ids@iiug.org</A></DIV> <DIV dir=3Dltr style=3D"FONT-SIZE: 11pt"><I><B>Subject:</B></I> Re: Delet= ing duplicate rows from a table. [4099]</DIV></DIV> <DIV> </DIV> <DIV>Hi, we use rowid column for did something like it.</DIV> <DIV>You can create another table with the same structure of original tab= le and</DIV> <DIV>create a primary key or an unique index over that table. After that = scan</DIV> <DIV>original table and insert records in the new table, with an exceptio= n block for</DIV> <DIV>cougth it.</DIV> <DIV> </DIV> <DIV>Paola</DIV> <DIV> </DIV> <DIV>Mensaje citado por manoj wadhwa <<A href=3D"mailto:itm= _manoj@yahoo.com">itm_manoj@yahoo.com</A>>:</DIV> <DIV> </DIV> <DIV>> Hi,</DIV> <DIV>></DIV> <DIV>> I have got a big table containing about 350 million</DIV> <DIV>> records. There are some duplicate records in the table</DIV> <DIV>> ( around 80,000). Please suggest some way to delete<= /DIV> <DIV>> the duplicate records keeping one copy of them.</DIV= > <DIV>></DIV> <DIV>> TIA,</DIV> <DIV>> Manoj</DIV> <DIV>></DIV> <DIV>> Background info : We loaded the data using HPL.</DIV> <DIV>> Because of space crunch, the HPL stopped inbetween.<= /DIV> <DIV>> After adding extra chunks, when we restarted it, it</DIV> <DIV>> loaded the same unl files again which are causing the</DIV> <DIV>> trouble.</DIV> <DIV>></DIV> <DIV>></DIV> <DIV>></DIV> <DIV>></DIV> <DIV>></DIV> <DIV>> __________________________________</DIV> <DIV>> Do you Yahoo!?</DIV> <DIV>> Yahoo! Mail - 250MB free storage. Do more. Manage less.</DIV> <DIV>> <A href=3D"http://info.mail.yahoo.com/mail_250">http://info.mai= l.yahoo.com/mail_250</A></DIV> <DIV>></DIV> <DIV>></DIV> <DIV>></DIV> <DIV> </DIV> <DIV> </DIV> <DIV> </DIV> <DIV>.</DIV></TD></TR> <TR> <TD id=3DINCREDIFOOTER width=3D"100%"> <TABLE cellSpacing=3D0 cellPadding=3D0 width=3D"100%"> <TBODY> <TR> <TD width=3D"100%"></TD> <TD id=3DINCREDISOUND vAlign=3Dbottom align=3Dmiddle></TD> <TD id=3DINCREDIANIM vAlign=3Dbottom align=3Dmiddle></TD></TR></TBODY></T= ABLE></TD></TR></TBODY></TABLE><SPAN id=3DIncrediStamp><SPAN dir=3Dltr><A= href=3D"http://www.incredimail.com/index.asp?id=3D54475"><IMG alt=3D"Add= FUN to your email - CLICK HERE!" hspace=3D0 src=3D"http://www2.incredima= il.com/contents/stamps/imstp_emo_en.gif" align=3Dbaseline border=3D0></A>= </SPAN></SPAN></BODY></HTML> --------------Boundary-00=_06HWLVC0000000000000--
--------------Boundary-00=_06HWQL80000000000000 Content-Type: image/gif; name="sg-0.gif" Content-Transfer-Encoding: base64 Content-ID: <58C56123-90FC-41F0-8CC7-16ACAC35AB34>
R0lGODlheQBAAPABAAAAAL29vSH5BAEAAAEALAAAAAB5AEAAAAL+jI+py+0Po5y0Wgmu3rxHAHri SE5ZVqZqiqLrC19tTNfQaef6Me8+3fsJWbyhceRyHZeaJPMpK0KnH4SSilU4s1yttAsOBMPdMRm7 PWfTaqq5DX3DmbPr/IgL3emGk9K+Z4Mj9hfoMzjYZ7iDKMYDuPiiR2gFGakyOUl5WfPnyQnk9KkI SlQk+sVgWbqg6eq4qQWyymoF27fVYpdZe0OKK6X7WNjrMNZIGqKZWpzAxgvr10DLSgx867ds2+y1 fZ2Up8rd7R2NWyc+znx7vons7KGN1grPnnhFrTor75auuPxq3Y1ZcyCNerAv3zmF88QRxPRwkcF9 JSiCchURScY0i8NYbCyljN+Gj7UsjkxIsprIhShRqpvW0uVLDgxnyqhpswLOnDp5+vwJNKjQoUSL GvVQAAA7
--------------Boundary-00=_06HWQL80000000000000--
Messages In This Thread
- Deleting duplicate rows from a table.
manoj wadhwa -- Tuesday, 25 January 2005, at 4:17 a.m.
- Re: Deleting duplicate rows from a table.
emthornber@iee.org -- Tuesday, 25 January 2005, at 6:53 a.m.
- Re: Deleting duplicate rows from a table.
pamadeo@cespi.unlp.edu.ar -- Tuesday, 25 January 2005, at 8:21 a.m.
- Re: Deleting duplicate rows from a table.
Syed Ahmad Najmi -- Tuesday, 25 January 2005, at 8:29 p.m.
- Re: Deleting duplicate rows from a table.
Jean Sagi -- Wednesday, 26 January 2005, at 9:19 a.m.
- RE: Deleting duplicate rows from a table.
manoj wadhwa -- Friday, 28 January 2005, at 7:13 a.m.
IDS Forum is maintained by Administrator with WebBBS 5.12.
|
|