IDI Data Dictionary: IR tax data
September 2015 edition

Crown copyright ©
This work is licensed under the Creative Commons Attribution 3.0 New Zealand licence. You are free to copy, distribute, and adapt the work, as long as you attribute the work to Statistics NZ and abide by the other licence terms. Please note you may not use any departmental or governmental emblem, logo, or coat of arms in any way that infringes any provision of the Flags, Emblems, and Names Protection Act 1981. Use the wording ‘Statistics New Zealand’ in your attribution, not the Statistics NZ logo.

Liability
While all care and diligence has been used in processing, analysing, and extracting data and information in this publication, Statistics New Zealand gives no warranty it is error free and will not be liable for any loss or damage suffered by the use directly, or indirectly, of the information in this publication.

Citation
Statistics New Zealand (2015). IDI Data Dictionary: IR tax data (September 2015 edition). Available from www.stats.govt.nz.
ISSN 2463-3615 (online)

Published in September 2015 by
Statistics New Zealand
Tatauranga Aotearoa
Wellington, New Zealand

Contact
Statistics New Zealand Information Centre: [email protected]
Phone toll-free 0508 525 525
Phone international +64 4 931 4600
www.stats.govt.nz

Contents
1 Purpose of this data dictionary
2 About the tax data
    Coverage
    Methodology
    Privacy, security, or confidentiality issues
    List of datasets
3 Data dictionary for ird_ems
    Dataset description
    Summary table
    Detailed information
4 Data dictionary for ird_addresses
    Dataset description
    Summary table
    Detailed information
5 Data dictionary for ird_customers
    Dataset description
    Summary table
    Detailed information
6 Data dictionary for ird_client_names
    Dataset description
    Summary table
    Detailed information
7 Data dictionary for ird_tax_registrations
    Dataset description
    Summary table
    Detailed information
8 Data dictionary for ird_cross_reference
    Dataset description
    Summary table
    Detailed information
9 Data dictionary for ird_rtns_keypoints_ir3
    Dataset description
    Summary table
    Detailed information
10 Data dictionary for ird_attachments_ir20
    Dataset description
    Summary table
    Detailed information
11 Data dictionary for ird_attachments_ir4s
    Dataset description
    Summary table
    Detailed information
12 Data dictionary for ird_old_systems_numbers
    Dataset description
    Summary table
    Detailed information
13 Glossary

1 Purpose of this data dictionary

IDI Data Dictionary: IR tax data (September 2015 edition) documents the content of the datasets that Inland Revenue (IR) provides to Statistics New Zealand to use in the Integrated Data Infrastructure (IDI). This document pulls together a number of existing documents about the IR tax data to create a ‘formalised’ central reference point for users.

This dictionary gives information on the variables contained in the IR tax datasets from April 1999 – including technical information and descriptions.

Use this data dictionary if you are interested in understanding and accessing the IR tax data in the IDI for your research.

2 About the tax data

Coverage
Reference period start: 1 April 1999
Reference period end: ongoing
Geographic coverage: all New Zealand

Methodology
Type of data: administrative data capture
Data collector: Inland Revenue
Frequency of data collection: supplied monthly to the IDI

Privacy, security, or confidentiality issues
In addition to the confidentiality clauses pertaining to all data held by Statistics New Zealand, the use of IR tax data is governed by the conditions specified in the Memorandum of Understanding between Stats NZ and Inland Revenue, as well as the conditions covered under the Tax Administration Act 1994. The IR tax datasets that are accessible to researchers do not contain any name or address information to identify an individual.
All researchers who have access to the tax data have had their research proposals assessed using Statistics NZ’s microdata access protocols, and only approved researchers who have been granted access by Statistics NZ and the Inland Revenue Department may view the tax data. Read Statistics NZ’s microdata access protocols.

All outputs produced from tax data must be aggregated, and counts must be suppressed if the underlying unrounded count is fewer than 6.

List of datasets
ird_ems
ird_addresses
ird_customers
ird_client_names
ird_tax_registrations
ird_cross_reference
ird_rtns_keypoints_ir3
ird_attachments_ir20
ird_attachments_ir4s
ird_old_systems_numbers

3 Data dictionary for ird_ems

Dataset description
Contents of dataset: The employee-level data from the EMS return for period dates from 1 April 1999.
Conditions:
    Active records only, ie ir_ems_return_line_item_code = 'A'.
    Exclude records with gross earnings equal to 0.
Note: Employers are able to file late returns and/or amend EMS returns relating to prior periods. This means that in a given period, data may be updated with:
(a) New active data for the latest period – however, do not include records that were created and made inactive within the same period.
(b) New data relating to prior periods – include new data that has been submitted to Inland Revenue, but relates to prior periods.
(c) Revisions relating to prior periods – include changes/revisions to data already supplied.
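The two conditions stated in the dataset description above (active records only, and no records with gross earnings equal to 0) can be sketched as a simple row filter. This is only an illustration: the EmsLine class and the sample rows are hypothetical stand-ins, not part of the IDI, and only the two variable names ir_ems_return_line_item_code and ir_ems_gross_earnings_amt come from this dictionary.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class EmsLine:
    """Hypothetical in-memory stand-in for one ird_ems row (only the fields used here)."""
    ir_ems_line_nbr: int
    ir_ems_return_line_item_code: str   # 'A' active or 'I' inactive
    ir_ems_gross_earnings_amt: Decimal


def meets_conditions(line: EmsLine) -> bool:
    """True when the row is active and its gross earnings are not equal to 0."""
    return (
        line.ir_ems_return_line_item_code == "A"
        and line.ir_ems_gross_earnings_amt != 0
    )


# Made-up rows for illustration only.
lines = [
    EmsLine(1, "A", Decimal("1250.00")),
    EmsLine(2, "I", Decimal("980.50")),   # inactive: dropped
    EmsLine(3, "A", Decimal("0.00")),     # gross earnings equal to 0: dropped
]

kept = [line for line in lines if meets_conditions(line)]
print([line.ir_ems_line_nbr for line in kept])  # prints [1]
```

Decimal is used here instead of float so that a stored amount of 0.00 compares exactly equal to 0; that is a choice made for this sketch, not a statement about how the IDI stores amounts.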
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | Y | Y | N | |
snz_ird_uid | | N | N | | employee_ird_number
snz_employer_ird_uid | Y | Y | N | | employer_ird_number
ir_ems_employer_location_nbr | Y | Y | 4N | | employer_location_number
ir_ems_return_period_date | Y | Y | Datetime | | return_period_date
ir_ems_line_nbr | Y | Y | 6N | | line_number
ir_ems_snz_unique_nbr | Y | Y | N | |
ir_ems_version_nbr | Y | Y | 6N | | version_number
ir_ems_doc_lodge_prefix_nbr | Y | Y | 1N | | doc_lodge_nbr_prefix
ir_ems_doc_lodge_nbr | Y | Y | 9N | | doc_lodge_nbr
ir_ems_doc_lodge_suffix_nbr | Y | Y | 2N | | doc_lodge_nbr_suffix
ir_ems_gross_earnings_amt | | N | 13.2N | | gross_earnings_amount
ir_ems_gross_earnings_imp_code | | Y | 1A | | gross_earnings_imp_code
ir_ems_paye_deductions_amt | | N | 13.2N | | paye_deductions_amount
ir_ems_paye_imp_ind | | Y | 1A | | paye_imp_ind
ir_ems_earnings_not_liable_amt | | N | 13.2N | | earnings_not_liable_amount
ir_ems_earnings_not_liab_imp_ind | | Y | 1A | | earnings_not_liab_imp_ind
ir_ems_fstc_amt | | N | 13.2N | | ftsc_amount
ir_ems_sl_amt | | N | 13.2N | | sl_amount
ir_ems_withholding_type_code | | Y | 1A | | withholding_type_code
ir_ems_income_source_code | | Y | 3A | | income_source_code
ir_ems_employee_start_date | | N | Datetime | | date_employee_started
ir_ems_employee_end_date | | N | Datetime | | date_employee_finished
ir_ems_lump_sum_ind | | N | 1A | | lump_sum_indicator
ir_ems_tax_code | | Y | 6A | tax_codes | tax_code
ir_ems_return_line_item_code | | Y | 1A | | return_line_item_status_code
ir_ems_processed_date | | Y | Datetime | | date_processed
ir_ems_ird_timestamp_date | | Y | Datetime | | timestamp
ir_ems_enterprise_nbr | | N | 10A | |
ir_ems_pbn_nbr | | N | 10A | |

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_employer_ird_uid
Definition: A local unique identifier (for an employer) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_employer_location_nbr
Definition: A location number is a sequence number that distinguishes between the locations with return filing obligations that a customer may have.
Format: Numeric, 9N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_return_period_date
Definition: Period covered by the return.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_line_nbr
Definition: A line item number is a sequence number used to identify the different line items on a return attachment. It starts at 1 and is incremented by 1 for each line item.
Format: Numeric, 6N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_version_nbr
Definition: A version number is a means of distinguishing one version of a return attachment line item from another. The version number is initialised at zero, then incremented by 1 each time the record is changed.
Format: Numeric, 6N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_doc_lodge_prefix_nbr
Definition: The prefix of the document lodgement number under which this schedule (or EMS) was filed. A prefix of 3 indicates a manual return; a prefix of 8 indicates an e-filed return.
Format: Numeric, 1N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_doc_lodge_nbr
Definition: The document lodgement number (DLN) is a unique number assigned to documents or returns lodged.
Format: Numeric, 9N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_doc_lodge_suffix_nbr
Definition: Suffix to document lodgement number.
Format: Numeric, 2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_gross_earnings_amt
Definition: Total earnings before tax is deducted. The gross earnings paid to the employee. The EMS may include more than one line item entry.
Format: Numeric, 13.2
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_gross_earnings_imp_code
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_paye_deductions_amt
Definition: Total income tax deductions.
Format: Numeric, 13.2
Name of classification:
Notes: This includes withholding payments.
_______________________________________
Variable name: ir_ems_paye_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_earnings_not_liable_amt
Definition: Income not liable for ACC earner premium.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_earnings_not_liab_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_fstc_amt
Definition: Family Support Tax Credit – the amount of family support paid to each WINZ beneficiary for the line item. This column only applies to NZISS customers. FSTC appears on DWI (WINZ) EMS schedules only, as DWI is the only (beneficiary) ‘employer’ to fill in this column, so it does not appear on the standard EMS form.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_sl_amt
Definition: Student loan repayments – student loan deduction amount for the line item. The amount is always displayed as a negative number. The student loan amount is then subtracted from the total student loan.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_withholding_type_code
Definition: P for PAYE deductions, W for withholding tax deductions.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_income_source_code
Definition: Code representing the source of income.
Format: Character, 3A
Name of classification: W&S – wages and salary, WHP – withholding payment, BEN – benefits, STU – Student Allowance, PPL – Paid Parental Leave, PEN – Pensions (superannuation), CLM – Claimants Compensation.
Notes:
_______________________________________
Variable name: ir_ems_employee_start_date
Definition: Start date of the employee. Is entered by the employer on the EMS.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_employee_end_date
Definition: End date of the employee. Is entered by the employer on the EMS.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_lump_sum_ind
Definition: Flag to indicate a lump sum payment.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_tax_code
Definition: Tax code of employee. The tax code at which deductions have been made for the employee for this line item number, eg 'M' main source of income. Only one job can have this code at any one time.
Format: Character, 6A
Name of classification: tax_codes
Notes:
_______________________________________
Variable name: ir_ems_return_line_item_code
Definition: Status code. A code is an abbreviation for a return line item status. Status values are 'A' active or 'I' inactive.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_processed_date
Definition: Process date.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
Variable name: ir_ems_enterprise_nbr
Definition: A unique identifier generated by Statistics NZ for an enterprise. An enterprise is an institutional unit and generally corresponds to legal entities operating in New Zealand. It can be a company, partnership, trust, estate, incorporated society, producer board, local or central government organisation, voluntary organisation, or self-employed individual.
Format: 10A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_pbn_nbr
Definition: Permanent Business Number. 10-character code, consisting of 'PB' prefix, followed by a unique 8-digit number. This is a Statistics NZ generated construct for a geographically located business unit.
Format: 10A
Name of classification:
Notes:

4 Data dictionary for ird_addresses

Dataset description
Contents of dataset: This table contains geocoded address information for an individual.

Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | | Y | N | |
snz_ird_uid | Y | Y | N | | ird_number
ir_apc_location_nbr | Y | Y | 4N | | location_number
ir_apc_address_type_code | Y | Y | 1A | address_types | address_type
ir_apc_snz_unique_nbr | | N | N | |
ir_apc_applied_date | | Y | Datetime | | date_applied
ir_apc_tax_type_code | Y | Y | 3A | tax_types | tax_type
ir_apc_main_address_ind | Y | Y | 1A | | main_address_indicator
ir_apc_post_code | | N | 6A | | post_code
ir_apc_address_status_code | | N | 1A | address_status | address_status
ir_apc_ceased_date | | N | Datetime | | date_ceased
ir_apc_ird_timestamp_date | | Y | Datetime | | timestamp
ir_apc_region_code | | N | 2A | |
ir_apc_ta_code | | N | 3A | |
ir_apc_meshblock_code | | N | 7A | |
ir_apc_meshblock_imputed_ind | | N | 1A | |
snz_idi_address_register_uid | | N | N | |

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: 7N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_location_nbr
Definition: Location number of the EMS filer (payroll system).
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_address_type_code
Definition: Type of address a client may have, eg 'L' - Physical location address, 'P' - Postal address, 'R' - Registered office, 'S' - Specific address, etc.
Format: Character, 1A
Name of classification: address_types
Notes:
_______________________________________
Variable name: ir_apc_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_applied_date
Definition: Date from which the record became valid.
Format: Datetime, dd/mm/yy
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_tax_type_code
Definition: Tax code.
Format: Character, 3A
Name of classification: tax_types
Notes:
_______________________________________
Variable name: ir_apc_main_address_ind
Definition: Y/N indicator that denotes whether the address is the client's main address. A client may have more than one main address.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_post_code
Definition: A numeric code assigned by NZ Post to an area within New Zealand and used for the delivery of mail.
Format: Character, 6A
Name of classification:
Notes: Approximately 90 percent of data is available in the post code field.
_______________________________________
Variable name: ir_apc_address_status_code
Definition: Current address status of customer, eg 'D' return to district office, 'I' invalid address, 'O' overseas address, 'V' valid address, etc.
Format: Character, 1A
Name of classification: address_status
Notes:
_______________________________________
Variable name: ir_apc_ceased_date
Definition: Date from which the record ceased to be valid.
Format: Datetime, dd/mm/yy
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
Variable name: ir_apc_region_code
Definition:
Format: 2A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_ta_code
Definition:
Format: 3A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_meshblock_code
Definition: A seven-digit meshblock number, which is the lowest level of a customer's geographic location.
Format: 7A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_meshblock_imputed_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: snz_idi_address_register_uid
Definition:
Format: N
Name of classification:
Notes:
_______________________________________

5 Data dictionary for ird_customers

Dataset description
Contents of dataset: This table holds birth_month, birth_year, and entity_type.

Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | Y | Y | N | |
snz_ird_uid | Y | Y | N | | ird_number
ir_cus_snz_unique_nbr | Y | Y | N | |
ir_cus_location_nbr | Y | Y | 4N | | location_number
ir_cus_entity_type_code | | Y | 1A | entity_types | entity_type
ir_cus_entity_class_code | | Y | 2A | entity_classes | entity_class
ir_cus_client_status_code | | Y | 1A | client_status | client_status
ir_cus_applied_date | | N | Datetime | | date_applied
ir_cus_ceased_date | | N | Datetime | | date_ceased
ir_cus_birth_year_nbr | | N | 4N | | date_of_birth
ir_cus_birth_month_nbr | | N | 2N | | date_of_birth
ir_cus_org_commencement_date | | N | Datetime | | org_commencement_date
ir_cus_loan_indicator_code | | N | 1A | | loan_indicator
ir_cus_resident_indicator_code | | N | 1A | | resident_indicator
ir_cus_sic_code | | N | 8A | sic_codes | sic_code

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_location_nbr
Definition: Location number of taxpayer.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_entity_type_code
Definition: Type of entity, eg C = company, M = Māori authority, P = partnership, I = individual, etc.
Format: Character, 1A
Name of classification: entity_types
Notes:
_______________________________________
Variable name: ir_cus_entity_class_code
Definition: Class of entity, eg BS = building society, UT = unit trust, SW = salary or wages, etc.
Format: Character, 2A
Name of classification: entity_classes
Notes:
_______________________________________
Variable name: ir_cus_client_status_code
Definition: Status of the client, eg C = ceased, B = bankrupt, A = active, L = liquidation, R = receivership, M = amalgamated company, S = struck off, U = undischarged bankrupt.
Format: Character, 1A
Name of classification: client_status
Notes: e.g. active/bankrupt/ceased active
_______________________________________
Variable name: ir_cus_applied_date
Definition: Date from which the record became active.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_ceased_date
Definition: Date from which the record became inactive.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_birth_year_nbr
Definition:
Format: 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_birth_month_nbr
Definition:
Format: 2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_org_commencement_date
Definition: Commencement date for any entity other than an individual, ie company, partnership, trust etc. Loan transfer date.
Format: Datetime, yyyymmdd
Name of classification:
Notes: May be set to 1/1/1970 if unknown.
_______________________________________
Variable name: ir_cus_loan_indicator_code
Definition: ‘Y’ indicates presence of student loan for individuals.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_resident_indicator_code
Definition: NZ resident / non-resident for tax purposes (R/N).
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_sic_code
Definition: Industry code, eg 511010 = supermarkets, 523100 = furniture retailing.
Format: Character, 8A
Name of classification: sic_codes
Notes:
_______________________________________

6 Data dictionary for ird_client_names

Dataset description
Contents of dataset: This table holds sex and client status information.
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | Y | Y | N | |
snz_ird_uid | Y | Y | N | | ird_number
ir_cli_snz_unique_nbr | | Y | N | |
ir_cli_location_nbr | Y | N | 4N | | location_number
ir_cli_client_name_type_code | Y | N | 2A | client_name_type | client_name_type
ir_cli_sequence_nbr | Y | N | 3N | | sequence_number
ir_cli_applied_date | Y | N | Datetime | | date_applied
ir_cli_sex_snz_code | | N | 1A | |
ir_cli_sex_imp_code | | Y | 1A | |
ir_cli_ceased_date | | N | Datetime | | date_ceased
ir_cli_ird_timestamp_date | | N | Datetime | | timestamp

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_location_nbr
Definition: Location number of the EMS filer (payroll system).
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_client_name_type_code
Definition: A code denoting the client name type, eg P = preferred name, S = secondary name, etc.
Format: Character, 2A
Name of classification: client_name_type
Notes:
_______________________________________
Variable name: ir_cli_sequence_nbr
Definition: A (sequence) number is the numeric code given to each of a client's names within the combination of IRD number, location number and client name type. It is not a serial number, but duplicates the code in the client name type entity. There is a 1:1 relationship between client name number and client name type code (No. = Code): 10 = P, 20 = S, 30 = A, 40 = C, 50 = T.
Format: Numeric, 3N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_applied_date
Definition: Date from which the record became valid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_sex_snz_code
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_sex_imp_code
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_ceased_date
Definition: Date from which the record became invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes: new name, death etc.
_______________________________________
Variable name: ir_cli_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________

7 Data dictionary for ird_tax_registrations

Dataset description
Contents of dataset: This table holds information about tax types.
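Records in this table carry ir_treg_applied_date and ir_treg_ceased_date, defined in the detailed information as the dates from which a record became active and invalid. The following Python sketch shows one way to select the records that apply on a given date. It rests on assumptions that this dictionary does not state: that a record applies from the applied date (inclusive) up to the ceased date (exclusive), and that a missing ceased date, represented here as None, means the record has not ceased. The tax type codes in the sample rows are placeholders, not values from the tax_types classification.

```python
from datetime import date


def is_valid_on(applied_date, ceased_date, as_at):
    """Return True if a record applies on the date `as_at`.

    Assumes validity runs from applied_date (inclusive) to ceased_date
    (exclusive), and that ceased_date=None means "not ceased".
    """
    if as_at < applied_date:
        return False
    return ceased_date is None or as_at < ceased_date


# Hypothetical rows:
# (ir_treg_tax_type_code, ir_treg_applied_date, ir_treg_ceased_date)
rows = [
    ("AAA", date(2010, 4, 1), date(2013, 4, 1)),
    ("AAA", date(2013, 4, 1), None),
    ("BBB", date(2005, 1, 1), None),
]

as_at = date(2012, 6, 30)
valid = [row for row in rows if is_valid_on(row[1], row[2], as_at)]
print([row[0] for row in valid])
```

The same comparison could be applied to ir_treg_treg_start_date and ir_treg_treg_end_date, subject to the same caveats about how open-ended registrations are stored.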

Summary table
IDI variable name                 Primary key  Mandatory  Format    Classification name    Source variable name
snz_uid                           Y            Y          N
snz_ird_uid                       Y            Y          N                                ird_number
ir_treg_location_nbr              Y            Y          4N                               location_number
ir_treg_tax_type_code             Y            Y          3A        tax_types              tax_type
ir_treg_applied_date              Y            Y          Datetime                         date_applied
ir_treg_snz_unique_nbr            Y            Y          N                                ir_treg_snz_unique_nbr
ir_treg_treg_start_date           Y            Y          Datetime                         treg_date_start
ir_treg_treg_end_date                          Y          Datetime                         treg_date_end
ir_treg_filing_frequency_code                  N          2A        tax_filing_frequency   filing_frequency
ir_treg_treg_status_code                       N          1A        tax_reg_status         treg_status
ir_treg_ceased_date                            N          Datetime                         date_ceased
ir_treg_posting_ind_code                       N          1A        posting_indicators     posting_ind
ir_treg_electronic_filing_ind                  N          1A                               electronic_filing_ind
ir_treg_corporate_filing_ind                   N          1A                               corporate_filing_ind
ir_treg_has_agent_ind                          Y          1A                               has_agent_ind
ir_treg_ird_timestamp_date                     Y          Datetime                         timestamp

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_location_nbr
Definition: Location number of the EMS filer (payroll system)
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_tax_type_code
Definition: Tax type
Format: Character, 3A
Name of classification: tax_types
_______________________________________
Variable name: ir_treg_applied_date
Definition: Date from which the record became active.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_treg_start_date
Definition: Date the client first registered for a particular tax type
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_treg_end_date
Definition: Date the client deregistered for a particular tax type.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_filing_frequency_code
Definition: eg D = twice monthly, Q = quarterly, I = irregularly
Format: Character, 2A
Name of classification: tax_filing_frequency
Notes:
_______________________________________
Variable name: ir_treg_treg_status_code
Definition: Active/Ceased 'X' Unknown
Format: Character, 1A
Name of classification: tax_reg_status
Notes:
_______________________________________
Variable name: ir_treg_ceased_date
Definition: Date from which the record became invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes: new name, death etc.
_______________________________________
Variable name: ir_treg_posting_ind_code
Definition: Distinguishes the type of address eg P = postal, Q = liquidator, A = agent etc.
Format: Character, 1A
Name of classification: posting_indicators
Notes:
_______________________________________
Variable name: ir_treg_electronic_filing_ind
Definition: 'Y' if an electronic filer, 'N' if paper filer
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_corporate_filing_ind
Definition: Indicates whether the customer is part of a corporate filing group.
Format: Character, 1A
Name of classification:
Notes: 'N' = not part of a group, 'P' = parent, 'S' = subsidiary
_______________________________________
Variable name: ir_treg_has_agent_ind
Definition: Indicates whether a tax agent acts on behalf of the customer.
Format: Character, 1A
Name of classification:
Notes: 'Y' = yes, 'N' = no.
_______________________________________
Variable name: ir_treg_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________

8 Data dictionary for ird_cross_reference

Dataset description
Contents of dataset: This table is maintained by IR and holds information about the set of relationships between two IRD numbers. Most of the information on this table is found when the annual returns are processed. As most of the other information is found on annual returns, it is only when the returns are processed that the information may be validated. However, validation may not always occur.
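Each relationship in this table is labelled by ir_xrf_reference_type_code, and the detailed information lists the codes with their descriptions. The following Python sketch shows one way to attach those descriptions to extracted rows and count relationships by type. The codes and descriptions are copied from this dictionary, including descriptions that are printed in truncated form. The function name, the sample rows, and the handling of unlisted codes are assumptions made for illustration.

```python
from collections import Counter

# Codes and descriptions as printed in the ir_xrf_reference_type_code entry.
# Truncated descriptions are kept exactly as they appear.
REFERENCE_TYPES = {
    "AAC": "Amalgd/Amalging Co", "ASS": "Associated Person",
    "BAN": "Bankrupt", "BEN": "Beneficiary", "DEC": "Deceased",
    "DEP": "Dependent", "DIR": "Director", "DUP": "Duplicate IRD No",
    "EOH": "Exec Office Holder", "GPR": "GENERAL PARTNER",
    "IGN": "NOMINATED ICA CO", "JVT": "Joint Venture",
    "LPR": "LIMITED PARTNER", "LQR": "LIQUIDATOR",
    "LTI": "LOOK-THROUGH INT", "LTO": "LOOK THROUGH OWNER",
    "NOM": "Nominated Company", "NOP": "NOMINEE", "NOR": "NOMINATOR",
    "NRC": "NON RES CHLD SUPPT", "PTR": "Partner", "SHR": "Shareholder",
    "SPO": "Spouse/Defacto", "SUB": "Subsidiary Company", "TEE": "Trustee",
    "TRA": "TRANSITIONAL CLIEN", "VAD": "VOLUNTARY ADMINIST",
}


def describe_reference_type(code):
    """Return the printed description, or None for a code not listed here."""
    return REFERENCE_TYPES.get(code.strip().upper())


# Hypothetical rows:
# (ir_xrf_from_snz_ird_uid, ir_xrf_to_snz_ird_uid, ir_xrf_reference_type_code)
rows = [(101, 202, "SHR"), (101, 303, "DIR"), (404, 202, "SHR"), (505, 606, "ZZZ")]

# Count relationships by description; unlisted codes are grouped together.
counts = Counter(describe_reference_type(row[2]) or "unlisted code" for row in rows)
print(counts)
```

Keeping unlisted codes in a separate group, rather than dropping them, makes it easier to see when the extracted data contains values outside the cross_reference_types list shown in this edition.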

Summary table
IDI variable name                 Primary key  Mandatory  Format    Classification name    Source variable name
snz_uid                           Y            Y          N
ir_xrf_from_snz_ird_uid           Y            Y          N                                ird_number_from
ir_xrf_to_snz_ird_uid             Y            Y          N                                ird_number_to
ir_xrf_applied_date               Y            Y          Datetime                         date_applied
ir_xrf_ceased_date                             N          Datetime                         date_ceased
ir_xrf_reference_type_code        Y            Y          3A        cross_reference_types  reference_type
ir_xrf_first_year_nbr                          N          4N                               first_year
ir_xrf_latest_year_nbr            Y            Y          4N                               latest_year
ir_xrf_ird_timestamp_date                      Y          Datetime                         timestamp

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_from_snz_ird_uid
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_to_snz_ird_uid
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_applied_date
Definition: Date from which record is valid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_ceased_date
Definition: Date from which record is invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_reference_type_code
Definition:
AAC  Amalgd/Amalging Co
ASS  Associated Person
BAN  Bankrupt
BEN  Beneficiary
DEC  Deceased
DEP  Dependent
DIR  Director
DUP  Duplicate IRD No
EOH  Exec Office Holder
GPR  GENERAL PARTNER
IGN  NOMINATED ICA CO
JVT  Joint Venture
LPR  LIMITED PARTNER
LQR  LIQUIDATOR
LTI  LOOK-THROUGH INT
LTO  LOOK THROUGH OWNER
NOM  Nominated Company
NOP  NOMINEE
NOR  NOMINATOR
NRC  NON RES CHLD SUPPT
PTR  Partner
SHR  Shareholder
SPO  Spouse/Defacto
SUB  Subsidiary Company
TEE  Trustee
TRA  TRANSITIONAL CLIEN
VAD  VOLUNTARY ADMINIST
Format: Character, 3A
Name of classification: cross_reference_types
Notes: eg shareholder/partner/bankrupt
_______________________________________
Variable name: ir_xrf_first_year_nbr
Definition: Start year of the cross reference relationship.
Format: Numeric, 4N
Name of classification:
Notes: eg shareholder/partner/bankrupt
_______________________________________
Variable name: ir_xrf_latest_year_nbr
Definition: Latest year of the cross reference relationship.
Format: Numeric, 4N
Name of classification:
Notes: eg shareholder/partner/bankrupt
_______________________________________
Variable name: ir_xrf_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________

9 Data dictionary for ird_rtns_keypoints_ir3

Dataset description
Contents of dataset: This table contains information for the active items which have non-zero partnership, self-employment, or shareholder salary income.
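The detailed information for this table defines ir_ir3_tot_pship_income_amt as partnership income, ir_ir3_net_profit_amt as self-employment income, and ir_ir3_tot_sholder_salary_amt as shareholder salary income. The following Python sketch expresses the "non-zero" condition from the dataset description as a row-level check. It is a sketch under stated assumptions rather than the rule used to build the table: it assumes rows are held as dictionaries, that a missing amount is represented as None, and that a negative amount counts as non-zero. The sample values are invented.

```python
from decimal import Decimal

# The three amounts named in the dataset description.
INCOME_FIELDS = (
    "ir_ir3_tot_pship_income_amt",    # partnership income
    "ir_ir3_tot_sholder_salary_amt",  # shareholder salary income
    "ir_ir3_net_profit_amt",          # self-employment income
)


def has_nonzero_income(row):
    """Return True if any of the three amounts is present and non-zero."""
    for field in INCOME_FIELDS:
        amount = row.get(field)
        # Missing amounts (None) are treated as "no income" by assumption.
        if amount is not None and amount != 0:
            return True
    return False


# Hypothetical rows; amounts use Decimal to match the 13.2N format.
rows = [
    {"snz_uid": 1, "ir_ir3_tot_pship_income_amt": Decimal("0.00"),
     "ir_ir3_tot_sholder_salary_amt": None,
     "ir_ir3_net_profit_amt": Decimal("-1250.50")},
    {"snz_uid": 2, "ir_ir3_tot_pship_income_amt": Decimal("0.00"),
     "ir_ir3_tot_sholder_salary_amt": None,
     "ir_ir3_net_profit_amt": Decimal("0.00")},
    {"snz_uid": 3, "ir_ir3_tot_sholder_salary_amt": Decimal("40000.00")},
]

selected = [row["snz_uid"] for row in rows if has_nonzero_income(row)]
print(selected)
```

Decimal is used instead of float so that amounts with two decimal places compare exactly with zero.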

Summary table
IDI variable name                 Primary key  Mandatory  Format    Classification name    Variable name
snz_uid                           Y            Y          N
snz_ird_uid                       Y            Y          N                                ird_number
ir_ir3_location_nbr               Y            Y          4N                               location_number
ir_ir3_return_period_date                      Y          Datetime                         return_period_date
ir_ir3_snz_unique_nbr             Y            Y          N
ir_ir3_tot_pship_income_amt                    N          13.2N                            total_partnership_income_808
ir_ir3_tot_sholder_salary_amt                  N          13.2N                            total_shareholder_salary_809
ir_ir3_net_profit_amt                          N          13.2N                            net_profit_702
ir_ir3_income_imp_ind                          Y          1A
ir_ir3_net_rents_826_amt                       N          13.2N                            net_rents_826
ir_ir3_tot_wholding_paymnts_amt                N          13.2N                            tot_w_holding_payments_100514
ir_ir3_tot_expenses_claimed_amt                N          13.2N                            total_expenses_claimed_1512
ir_ir3_gross_earnings_407_amt                  N          13.2N                            gross_earnings_407
ir_ir3_ird_timestamp_date                      N          Datetime                         timestamp

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_location_nbr
Definition: Location number of the EMS filer (payroll system).
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_return_period_date
Definition: Period covered by return.
Format: Datetime, dd/mm/yy
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_tot_pship_income_amt
Definition: Partnership income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_tot_sholder_salary_amt
Definition: Shareholder salary income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_net_profit_amt
Definition: Self-employment income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_income_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_net_rents_826_amt
Definition: Net rental income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_tot_wholding_paymnts_amt
Definition: Total gross earnings (with withholding tax deducted at source).
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_tot_expenses_claimed_amt
Definition: Total expenses claimed.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_gross_earnings_407_amt
Definition: Gross earnings with PAYE deducted at source.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________

10 Data dictionary for ird_attachments_ir20

Dataset description
Contents of dataset: This table contains information for active items which have non-zero partnership income.

Summary table
IDI variable name                 Primary key  Mandatory  Format    Classification name    Variable name
snz_uid                           Y            Y          N
snz_ird_uid                       Y            Y          N                                ird_number
snz_employer_ird_uid                           Y          N                                employer_ird_number
ir_ir20_location_nbr              Y            Y          4N                               location_number
ir_ir20_return_period_date        Y            Y          Datetime                         return_period_date
ir_ir20_snz_unique_nbr                         Y          N
ir_ir20_tot_share_of_inc_865_amt               N          13.2N                            tot_share_of_inc_865_amt
ir_ir20_income_imp_ind                         Y          1A                               income_imp_ind
ir_ir20_ird_timestamp_date                     N          Datetime                         timestamp

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_employer_ird_uid
Definition: A local unique identifier (for an employer) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes.
Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_location_nbr
Definition: Location number of the payer.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_return_period_date
Definition: The return period.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_tot_share_of_inc_865_amt
Definition: Value of partnership income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_income_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________

11 Data dictionary for ird_attachments_ir4s

Dataset description
Contents of dataset: This table holds information about the active items which have non-zero shareholder income.

Summary table
IDI variable name                 Primary key  Mandatory  Format    Classification name    Source variable name
snz_uid                           Y            Y          N
snz_ird_uid                                    Y          N                                ird_number
snz_employer_ird_uid              Y            Y          N                                employer_ird_number
ir_ir4_location_nbr               Y            Y          4N                               location_number
ir_ir4_return_period_date         Y            Y          Datetime                         return_period_date
ir_ir4_snz_unique_nbr             Y            Y          N
ir_ir4_tot_sholder_sal_809_amt                 N          13.2N                            total_shareholder_salary_809
ir_ir4_income_imp_ind                          Y          1A                               income_imp_ind
ir_ir4_ird_timestamp_date                      Y          Datetime                         timestamp

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_employer_ird_uid
Definition: A local unique identifier (for an employer) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_location_nbr
Definition: Location number of the payer.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_return_period_date
Definition: The return period.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_tot_sholder_sal_809_amt
Definition: Value of shareholder salary.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_income_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________

12 Data dictionary for ird_old_systems_numbers

Dataset description
Contents of dataset: This table contains the mapping of IRD numbers from the old system to the new system.

Summary table
IDI variable name                 Primary key  Mandatory  Format    Classification name    Source variable name
snz_uid                           Y            Y          N
ir_osn_old_snz_ird_uid            Y            Y          N                                old_system_number
snz_ird_uid                                    Y          N                                ird_number
ir_osn_location_nbr                            Y          N                                location_number
ir_osn_applied_date               Y            Y          Datetime                         date_applied
ir_osn_ceased_date                             N          Datetime                         date_ceased
ir_osn_ird_timestamp_date                      Y          Datetime                         timestamp

Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_osn_old_snz_ird_uid
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_osn_location_nbr
Definition: Location number of the payer.
Format: Numeric,
Name of classification:
Notes:
_______________________________________
Variable name: ir_osn_applied_date
Definition: Date from which record is valid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_osn_ceased_date
Definition: Date from which record is invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_osn_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________

13 Glossary

Term                      Definition
IDI name (Stats NZ)       The variable names in the IDI SQL database.
Mandatory field (IR)      Indicates a field which cannot be “null”.
Primary key (Stats NZ)    An identifier for a unique database item (may consist of a single item or multiple items in combination).