Dedup : Splunk Documentation | Step-By-Step Process | Expert’s Top Picks

Last updated on 17th Dec 2021, Blog, General

About author

Pradip Mehrotra (Senior Splunk SIEM Engineer)

Pradip Mehrotra is a senior Splunk SIEM engineer with 7+ years of experience. He specializes in analytics-driven SIEM tools, including IBM QRadar, that collect, analyze, and correlate high volumes of network and other machine data in real time.

    Removes the events that contain an identical combination of values for the fields that you specify.

    • Introduction to Splunk Dedup
    • The functionality of Splunk Dedup
    • Differentiation between Uniq and Splunk Dedup commands
    • Usage of Splunk Dedup command
    • Lexicographical order
    • Dedup as filtering command
    • Different functions of Splunk Dedup filtering commands.
    • Sort_field options in Splunk Dedup
    • Example of Splunk Dedup command execution
    • Conclusion


      Introduction to Splunk Dedup


      The Splunk dedup command removes all events that contain an identical combination of values for the fields the user specifies. It removes duplicate values from the results and, in a historical search, displays only the most recent log for a particular incident. For each value of the specified field, dedup returns the first event found in search order.

      The functionality of Splunk Dedup

      With the dedup command, the user can specify the number of duplicate events to keep, either for each value of a single field or for each combination of values among several fields. The events returned by dedup are based on search order. For historical searches, the most recent events are searched first. For real-time searches, the first events received are searched, and these are not necessarily the most recent events that occurred. The user can also sort the fields to make it clear which events are retained. Other options let the user keep all events while removing the duplicated fields, or keep events in which the specified fields do not exist.
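
      The behaviour described above can be modelled outside Splunk. The following is a minimal Python sketch, not Splunk code: the function name dedup, the sample events, and the choice to drop events that lack a field are assumptions made for illustration. It keeps at most count events per combination of field values, in the order the events arrive (the search order).

```python
from collections import Counter


def dedup(events, fields, count=1):
    """Keep at most `count` events per combination of values in `fields`.

    Events are processed in the order given, standing in for search order.
    Assumption: events missing any of the fields are dropped.
    """
    seen = Counter()
    kept = []
    for event in events:
        if any(f not in event for f in fields):
            continue
        key = tuple(event[f] for f in fields)
        if seen[key] < count:
            seen[key] += 1
            kept.append(event)
    return kept


# Hypothetical events, listed in search order.
events = [
    {"host": "web1", "source": "access.log", "msg": "c"},
    {"host": "web1", "source": "access.log", "msg": "b"},
    {"host": "web1", "source": "error.log", "msg": "a"},
    {"host": "web2", "source": "access.log", "msg": "z"},
]

by_host = dedup(events, ["host"])                   # one event per host
by_host_source = dedup(events, ["host", "source"])  # one per host/source pair
two_per_host = dedup(events, ["host"], count=2)     # up to two per host

print([e["msg"] for e in by_host])         # ['c', 'z']
print([e["msg"] for e in by_host_source])  # ['c', 'a', 'z']
print([e["msg"] for e in two_per_host])    # ['c', 'b', 'z']
```

      Because the first event seen for each key wins, reordering the input changes which event survives, which is why search order and sorting matter.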

      Differentiation between Uniq and Splunk Dedup commands

      The main function of the uniq command is to remove duplicate data when the entire row or event is identical. The dedup command, by contrast, looks only at the specified fields. For example, if the user runs “| dedup host”, dedup looks only at the host field and keeps the first event from each host. With dedup you can specify several fields, and it offers options such as consecutive, which removes only events with duplicate combinations of values that are consecutive, and keepempty, which retains events that do not have the specified field. The uniq command removes any search result that is an exact duplicate of the previous result, so the events must be sorted before it is used. The dedup command is considerably more flexible than uniq: it can be map-reduced, it can be limited to a specific number of events per value (defaulting to one), and it can be applied to any number of fields at the same time.
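
      To make the contrast concrete, here is a small Python model of the two behaviours. It is a sketch under stated assumptions, not Splunk code: the function names uniq and dedup_field and the sample rows are invented for illustration.

```python
def uniq(rows):
    """Drop a row only when it is identical to the row immediately before it."""
    out = []
    for row in rows:
        if not out or row != out[-1]:
            out.append(row)
    return out


def dedup_field(rows, field):
    """Keep the first row seen for each value of a single field."""
    seen = set()
    out = []
    for row in rows:
        if row[field] not in seen:
            seen.add(row[field])
            out.append(row)
    return out


# Hypothetical, unsorted results.
rows = [
    {"host": "a", "status": 200},
    {"host": "a", "status": 200},
    {"host": "b", "status": 500},
    {"host": "a", "status": 200},  # same as the first row, but not adjacent
    {"host": "a", "status": 404},
]

after_uniq = uniq(rows)                  # 4 rows: only the adjacent repeat goes
after_dedup = dedup_field(rows, "host")  # 2 rows: first row for each host

print(len(after_uniq), len(after_dedup))  # 4 2
```

      The fourth row survives uniq because it is not adjacent to its twin, which illustrates why the results need to be sorted before uniq can remove every exact duplicate.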

      Usage of Splunk Dedup command

      Avoid using the dedup command on the _raw field when searching over a large volume of data. If you do, the text of every event is retained in memory, which degrades search performance. This is expected behavior, and it applies to any field with high cardinality and large size. As an example of ordinary use, if the user searches all logs and applies dedup to the user ID field (uid), the results show only one log for each uid, with no repetition.

      Lexicographical order

    • Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software, this is almost always UTF-8 encoding, a superset of ASCII. In lexicographical order, numbers are sorted before letters, and numbers are sorted based on their first digit. For example, the numbers 10, 9, 70, 100 are sorted lexicographically as 10, 100, 70, 9.
    • In alphabetical sequences, uppercase letters are sorted before lowercase letters. Symbols do not follow a standard pattern in lexicographical order: some are sorted before numeric values, and others are sorted before or after letters.
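
      The ordering rules can be checked with a short Python sketch. Sorting strings by their UTF-8 bytes is used here as a stand-in for the lexicographical order described; the sample values beyond 10, 9, 70, 100 are assumptions chosen for illustration.

```python
def lexicographic(values):
    """Sort strings by their UTF-8 encoded bytes."""
    return sorted(values, key=lambda s: s.encode("utf-8"))


numbers = ["10", "9", "70", "100"]
mixed = ["banana", "Apple", "apple", "42", "Zebra", "#tag", "_tmp", "~end"]

lex_numbers = lexicographic(numbers)
num_numbers = sorted(numbers, key=int)  # numeric order, for comparison
lex_mixed = lexicographic(mixed)

print(lex_numbers)  # ['10', '100', '70', '9']
print(num_numbers)  # ['9', '10', '70', '100']
print(lex_mixed)
# ['#tag', '42', 'Apple', 'Zebra', '_tmp', 'apple', 'banana', '~end']
```

      The symbols land in three different places ("#" before digits, "_" between uppercase and lowercase, "~" after lowercase), which matches the point that symbols have no single position in this order.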

      Dedup as filtering command

      Dedup acts as a filtering command: it takes the search results from a previously executed command and reduces them to a smaller set of output. Removing redundant data is its core function. Dedup removes results that match specific criteria. That is, the command keeps only the first count results for each combination of values of the specified fields. If count is not specified, it defaults to one and returns the first result found.

      Different functions of Splunk Dedup filtering commands.

      The dedup filtering command has separate options for particular situations. To keep all of the results and remove only the duplicate data, use the keepevents option. The results returned are the first results found with each combination of the specified field values, which are generally the most recent ones; use the sortby clause to change the order if needed. Finally, the keepempty option controls what happens to events in which the specified fields do not exist at all. In current Splunk documentation such events are dropped by default, and keepempty=true retains them.
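
      The effect of these options can be sketched in Python. This is a simplified model, not Splunk's implementation: the option names keepevents and keepempty follow the current Splunk dedup syntax, while the function, the sample events, and the way the two options interact are assumptions made for illustration.

```python
def dedup(events, fields, keepevents=False, keepempty=False):
    """Model of dedup with the keepevents and keepempty options.

    keepevents: keep every event, but strip the dedup fields from
                events after the first one with a given combination.
    keepempty:  keep events that are missing one of the dedup fields.
    """
    seen = set()
    out = []
    for event in events:
        if any(f not in event for f in fields):
            if keepempty:
                out.append(event)
            continue
        key = tuple(event[f] for f in fields)
        if key not in seen:
            seen.add(key)
            out.append(event)
        elif keepevents:
            # Duplicate: retain the event without the duplicated fields.
            out.append({k: v for k, v in event.items() if k not in fields})
    return out


# Hypothetical events in search order; the third has no uid field.
events = [
    {"uid": "u1", "action": "login"},
    {"uid": "u1", "action": "logout"},
    {"action": "heartbeat"},
    {"uid": "u2", "action": "login"},
]

default = dedup(events, ["uid"])
with_keepevents = dedup(events, ["uid"], keepevents=True)
with_keepempty = dedup(events, ["uid"], keepempty=True)

print(len(default), len(with_keepevents), len(with_keepempty))  # 2 3 3
```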

      sort_field options in Splunk Dedup

      The dedup command has several sort_field options, which specify how to sort the events. field is the name of the field to sort. auto determines automatically how to sort the field values. ip interprets the field values as IP addresses, and num interprets the field values as numbers. Finally, str orders the field values lexicographically.
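
      The difference between these interpretations shows up as soon as the same values are sorted in more than one way. The Python sketch below models the str, num, and ip options with sort keys. The auto rule here (try numbers, then IP addresses, then fall back to strings) is only a guess for illustration and is not taken from Splunk documentation; the function name and sample values are likewise hypothetical.

```python
import ipaddress

# One sort key per interpretation of the field values.
SORT_KEYS = {
    "str": lambda v: v.encode("utf-8"),            # lexicographic
    "num": float,                                  # numeric
    "ip": lambda v: int(ipaddress.ip_address(v)),  # IP address
}


def sort_values(values, mode="auto"):
    """Sort string values using the named interpretation."""
    if mode == "auto":
        # Assumed rule: use the first interpretation that fits every value.
        for candidate in ("num", "ip"):
            try:
                return sort_values(values, candidate)
            except ValueError:
                continue
        return sort_values(values, "str")
    return sorted(values, key=SORT_KEYS[mode])


numbers = ["10", "9", "70", "100"]
ips = ["10.0.0.20", "10.0.0.3", "9.255.0.1"]

print(sort_values(numbers, "str"))  # ['10', '100', '70', '9']
print(sort_values(numbers, "num"))  # ['9', '10', '70', '100']
print(sort_values(ips, "str"))      # ['10.0.0.20', '10.0.0.3', '9.255.0.1']
print(sort_values(ips, "ip"))       # ['9.255.0.1', '10.0.0.3', '10.0.0.20']
```

      Since dedup keeps the first event per key, choosing the wrong interpretation can change which event is retained.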

      Example of Splunk Dedup command execution

      This example shows how the dedup command is used. Suppose a user wants to group all events with repeated occurrences of a value in order to remove those repetitions from reports and alerts. Suppose the events are as follows:

    • 2012-07-22 11:45:23 code=239
    • 2012-07-22 11:45:25 code=773
    • 2012-07-22 11:45:26 code=-1
    • 2012-07-22 11:45:27 code=-1
    • 2012-07-22 11:45:28 code=-1
    • 2012-07-22 11:45:29 code=292
    • 2012-07-22 11:45:30 code=292
    • 2012-07-22 11:45:32 code=-1
    • 2012-07-22 11:45:33 code=444
    • 2012-07-22 11:45:35 code=-1
    • 2012-07-22 11:45:36 code=-1

      Here the goal is to obtain seven events, one for each run of identical code values in a row: 239, 773, -1, 292, -1, 444, -1. Users often take the wrong step in this case and run the transaction command ( … | transaction code ), but the dedup command is a much more straightforward approach. A plain dedup would remove all of the duplicate events; in this case a command that removes only duplicates that appear in a cluster is needed, so … | dedup code consecutive=true is used to achieve the desired result.
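
      The expected result can be verified with a small Python model of consecutive deduplication. It is a sketch, not Splunk code: the function names are invented, and the events are processed in the order listed above rather than in Splunk's search order.

```python
def dedup_consecutive(events, field):
    """Drop an event only when its field value equals the previous event's."""
    out = []
    previous = object()  # sentinel that never equals a real value
    for event in events:
        value = event.get(field)
        if value != previous:
            out.append(event)
        previous = value
    return out


def dedup_all(events, field):
    """Plain dedup: keep the first event for each distinct field value."""
    seen = set()
    out = []
    for event in events:
        if event[field] not in seen:
            seen.add(event[field])
            out.append(event)
    return out


# The eleven sample events from the list above.
times = ["11:45:23", "11:45:25", "11:45:26", "11:45:27", "11:45:28",
         "11:45:29", "11:45:30", "11:45:32", "11:45:33", "11:45:35",
         "11:45:36"]
codes = [239, 773, -1, -1, -1, 292, 292, -1, 444, -1, -1]
events = [{"time": "2012-07-22 " + t, "code": c} for t, c in zip(times, codes)]

consecutive_codes = [e["code"] for e in dedup_consecutive(events, "code")]
plain_codes = [e["code"] for e in dedup_all(events, "code")]

print(consecutive_codes)  # [239, 773, -1, 292, -1, 444, -1]
print(plain_codes)        # [239, 773, -1, 292, 444]
```

      The consecutive variant returns the seven events wanted, while plain deduplication collapses every -1 into a single event and returns only five.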


      The apps in Splunkbase fill some, but not all, of the gap between the command-line, techie-friendly searches you get out of the box and what network managers have come to expect from modern applications. Of course, it all depends on which enterprise applications you are running. For example, Splunk has developed a free Splunkbase app with extensive hooks into Microsoft Exchange, including dashboards, message tracking, performance indicators, and capacity planning. If you are running Exchange, that is great. If you are on Domino, re-creating that value will not be easy.
