Record matching Rules | Best Book Buddies

Koha Administration

How to set catalog controls?

Set these controls before you start cataloging on your Koha system.

Joy Nelson

ByWater Solutions

Edited by

Nicole C. Engard

Changed/edited content where necessary.

2013

Record matching rules are used when importing MARC records into Koha.

  • Get there: More > Administration > Catalog > Record Matching Rules

The rules that you set up here will be referenced when you Stage MARC Records for Import.

It is important to understand the difference between Match Points and Match Checks before adding new matching rules to Koha.

Match Points are the criteria that must be met for an incoming record to match an existing MARC record in your catalog. You can have multiple match points on an import rule, each with its own score. An incoming record is compared against your existing records (‘one record at a time’) and given a score for each match point. When the total score of the match points meets or exceeds the threshold given for the matching rule, Koha assumes a good match and imports/overlays according to your specifications in the import process. Watch the sum of the match points here: double-check that the matches you want will add up to a successful match.

Example:

Threshold of 1000

Match Point on 020$a 1000

Match Point on 022$a 1000

Match Point on 245$a 500

Match Point on 100$a 100

In the example above, a match on either the 020$a or the 022$a will result in a successful match. A match on 245$a title and 100$a author (and not on 020$a or 022$a) will only add up to 600 and not be a match. A match on 020$a and 245$a will add up to 1500. This is a successful match, but the extra 500 points for the 245$a title match are superfluous, because the incoming record already matched on the 020$a without the need for the 245$a match. However, if you assigned a score of 500 to the 100$a Match Point, a match on 245$a title and 100$a author would be considered a successful match (total of 1000) even if the 020$a is not a match.
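The arithmetic in this example can be sketched in a few lines of Python. This only illustrates the scoring idea and is not Koha's actual code; the names `THRESHOLD`, `MATCH_POINTS`, `total_score`, and `is_match` are made up for the sketch.

```python
# Illustrative only: scores and threshold are taken from the example above.
THRESHOLD = 1000
MATCH_POINTS = {"020$a": 1000, "022$a": 1000, "245$a": 500, "100$a": 100}


def total_score(matched_fields):
    """Sum the scores of the match points that matched."""
    return sum(MATCH_POINTS[field] for field in matched_fields)


def is_match(matched_fields):
    """A record matches when its total score meets or exceeds the threshold."""
    return total_score(matched_fields) >= THRESHOLD


print(is_match(["020$a"]))           # True: 1000
print(is_match(["245$a", "100$a"]))  # False: 600
print(is_match(["020$a", "245$a"]))  # True: 1500
```

Changing the 100$a score to 500 in `MATCH_POINTS` makes the title-plus-author case reach exactly 1000, which then counts as a match.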

Match Checks are not commonly used in import rules. However, they can serve a couple of purposes in matching records. First, match checks can be used as the matching criteria instead of the match points if your indexes are stale and out of date. The match checks go right for the data instead of relying on the data in the indexes. (If you fear your indexes are out of date, a rebuild of your indexes would be a great idea and solve that situation!) The other use for a Match Check is as a “double check” or “veto” of your matching rule. For example, if you have a matching rule as below:

Threshold of 1000

Match Point on 020$a 1000

Match Check on 245$a

Koha will first look at the 020$a tag/subfield to see if the incoming record matches an existing record. If it does, it will then move on to the Match Check and look directly at the 245$a value in the incoming data and compare it to the 245$a in the existing ‘matched’ record in your catalog. If the 245$a matches, Koha continues on as if a match was successful. If the 245$a does not match, then Koha concludes that the two records are not a match after all. The Match Checks can be a really useful tool in confirming true matches.
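The “veto” behavior described above can be sketched as follows. This is a simplified illustration, not Koha's implementation: the `normalize` and `confirm_match` helpers, the dictionary record layout, and the sample values are all assumptions made for the example.

```python
def normalize(value):
    # Assumed normalization for this sketch: ignore case and outer whitespace.
    return value.strip().lower()


def confirm_match(incoming, existing, check_fields):
    """Return True only if every match check field agrees in both records."""
    return all(
        normalize(incoming.get(field, "")) == normalize(existing.get(field, ""))
        for field in check_fields
    )


# Made-up records that already matched on the 020$a match point.
incoming = {"020$a": "1234567890", "245$a": "Example Title"}
same_title = {"020$a": "1234567890", "245$a": "example title "}
other_title = {"020$a": "1234567890", "245$a": "A Different Title"}

print(confirm_match(incoming, same_title, ["245$a"]))   # True: match stands
print(confirm_match(incoming, other_title, ["245$a"]))  # False: match vetoed
```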

When creating matching rules for your authority records, the following indexes will be of use:

Table 2.1. Authority Indexes

Index name                Matches MARC tag
LC-cardnumber             010$a
Personal-name             100$a
Corporate-name-heading    110$a
Meeting-name              111$a
Title-uniform             130$a
Chronological-term        148$a
Subject-topical           150$a
Name-geographic           151$a
Term-genre-form           155$a

To create a new matching rule:

  • Click 'New Record Matching Rule'

    • Choose a unique name and enter it in the 'Matching rule code' field

    • 'Description' can be anything you want to make it clear to you what rule you're picking

    • 'Match threshold' is the total number of 'points' a record must earn to be considered a 'match'

    • 'Record type' is the type of import this rule will be used for - either authority or bibliographic

    • Match points are set up to determine what fields to match on

    • 'Search index' can be found by looking at the ccl.properties file on your system, which tells the Zebra indexing what data to search for in the MARC data.

    • 'Score' - The number of 'points' a match on this field is worth. If the sum of the scores meets or exceeds the match threshold, the incoming record is a match to the existing record

    • Enter the MARC tag you want to match on in the 'Tag' field

    • Enter the MARC tag subfield you want to match on in the 'Subfields' field

    • 'Offset' - For use with control fields, 001-009

    • 'Length' - For use with control fields, 001-009

    • Koha only has one 'Normalization rule' that removes extra characters such as commas and semicolons. The value you enter in this field is irrelevant to the normalization process.

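    • 'Required match checks' - Optional Match Checks, as described above, that compare the data in the incoming record directly against the matched record to confirm or veto the match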
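As a rough illustration of how 'Offset' and 'Length' could select part of a control field, here is a minimal Python sketch. It rests on assumptions rather than on Koha's code: that the offset is zero-based, and that a length of 0 means "use everything from the offset onward". The `extract_key` helper and the sample value are hypothetical.

```python
def extract_key(control_field, offset=0, length=0):
    """Return the part of a control field that would be used for matching.

    Assumptions for this sketch: offset is zero-based, and a length of 0
    means "everything from offset to the end of the field".
    """
    if length > 0:
        return control_field[offset:offset + length]
    return control_field[offset:]


# Made-up control number value.
print(extract_key("ocm12345678"))        # ocm12345678
print(extract_key("ocm12345678", 3, 8))  # 12345678
```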

Sample matching rule:
  • Match threshold: 100

  • Record type: Bibliographic

      Note

      If you'd like a rule to match on the 001 in authority records, you will need to repeat all of these values and change just the record type to 'Authority record'

  • Matchpoints (just the one):

  • Search index: Control-number

  • Score: 101

  • Tag: 001

      Note

      This field is for the control number assigned by the organization creating, using, or distributing the record

  • Subfields: a

  • Offset: 0

  • Length: 0

  • Normalization rule: Control-number

  • Required Match checks: none (remove the blank one)
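For reference, the sample values above can be written down as a small Python structure together with a sanity check that the rule can reach its threshold at all. The dictionary layout and the `can_ever_match` helper are assumptions made for this sketch; they are not a Koha import or export format.

```python
# The sample rule above, written as plain data (the layout is hypothetical).
sample_rule = {
    "match_threshold": 100,
    "record_type": "Bibliographic",
    "match_points": [
        {
            "search_index": "Control-number",
            "score": 101,
            "tag": "001",
            "subfields": "a",
            "offset": 0,
            "length": 0,
            "normalization_rule": "Control-number",
        },
    ],
    "required_match_checks": [],
}


def can_ever_match(rule):
    """Check that the match point scores can add up to the threshold."""
    best_possible = sum(point["score"] for point in rule["match_points"])
    return best_possible >= rule["match_threshold"]


print(can_ever_match(sample_rule))  # True: 101 >= 100
```

A check like this catches rules whose match points can never add up to a successful match, which is the pitfall noted in the Match Points discussion.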

