Getting duplicates in S3 data upon ingest
A customer contacted me via ticket 28381 regarding an issue with one of his S3 jobs creating duplicate entries of selected key fields upon ingest. He wanted to know whether Gainsight offers a way to remove these entries by default.
Here is the initial description of the issue per the customer:
I'm working with the S3 connectors, and for each ingest job, there is a section that says: "Select key fields to identify unique records".

With the data that I have getting exported into the S3 bucket that Gainsight pulls to ingest, I have some duplicates, so I don't want those getting written as new rows in the upsert to the MDA.

With the "Select key fields to identify unique records" function, I was hoping these fields would essentially combine to create a composite unique key (I have 3 fields chosen for this function), like you could create in a regular MySQL database. However, when I run the ingest, I still get duplicate rows where all 3 values are the same between the duplicate rows.
The last development on this issue was that the client was going to try to remove these duplicate entries before ingest. What is the possibility of adding a duplicate checker (or something similar) to the UI for the S3 connector?
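As a stopgap while this remains an enhancement request, the duplicates can be stripped from the CSV before it lands in the S3 bucket. A minimal sketch in Python, assuming a CSV export and three hypothetical key fields (the field names here are illustrative, not the customer's actual schema):

```python
import csv
import io

def dedupe_rows(rows, key_fields):
    """Keep only the first row seen for each composite key.

    The composite key is the tuple of values for key_fields,
    mirroring the "Select key fields to identify unique records"
    behavior the customer expected from the connector.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Sample export with one exact duplicate on the three key fields
# (account_id, product, event_date are assumed names for illustration).
raw = """account_id,product,event_date,value
A1,Widget,2020-01-01,10
A1,Widget,2020-01-01,10
A2,Widget,2020-01-02,5
"""

rows = list(csv.DictReader(io.StringIO(raw)))
deduped = dedupe_rows(rows, ["account_id", "product", "event_date"])
print(len(deduped))  # 2 unique rows remain
```

Running a pass like this before uploading to the bucket means the ingest job never sees the duplicate rows, regardless of how the connector treats the selected key fields.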
Per team comments, this is working as designed, and the team has accepted it as an enhancement request, so I am changing this post to an idea type.