IDI Data Dictionary: IR tax data

September 2015 edition
Crown copyright ©
This work is licensed under the Creative Commons Attribution 3.0 New Zealand licence.
You are free to copy, distribute, and adapt the work, as long as you attribute the work to
Statistics NZ and abide by the other licence terms. Please note you may not use any
departmental or governmental emblem, logo, or coat of arms in any way that infringes any
provision of the Flags, Emblems, and Names Protection Act 1981. Use the wording
‘Statistics New Zealand’ in your attribution, not the Statistics NZ logo.
Liability
While all care and diligence has been used in processing, analysing, and extracting data
and information in this publication, Statistics New Zealand gives no warranty it is error free
and will not be liable for any loss or damage suffered by the use directly, or indirectly, of the
information in this publication.
Citation
Statistics New Zealand (2015). IDI Data Dictionary: IR tax data (September 2015 edition).
Available from www.stats.govt.nz.
ISSN 2463-3615 (online)
Published in September 2015 by
Statistics New Zealand
Tatauranga Aotearoa
Wellington, New Zealand
Contact
Statistics New Zealand Information Centre: [email protected]
Phone toll-free 0508 525 525
Phone international +64 4 931 4600
www.stats.govt.nz
Contents
1 Purpose of this data dictionary
2 About the tax data
  Coverage
  Methodology
  Privacy, security, or confidentiality issues
  List of datasets
3 Data dictionary for ird_ems
  Dataset description
  Summary table
  Detailed information
4 Data dictionary for ird_addresses
  Dataset description
  Summary table
  Detailed information
5 Data dictionary for ird_customers
  Dataset description
  Summary table
  Detailed information
6 Data dictionary for ird_client_names
  Dataset description
  Summary table
  Detailed information
7 Data dictionary for ird_tax_registrations
  Dataset description
  Summary table
  Detailed information
8 Data dictionary for ird_cross_reference
  Dataset description
  Summary table
  Detailed information
9 Data dictionary for ird_rtns_keypoints_ir3
  Dataset description
  Summary table
  Detailed information
10 Data dictionary for ird_attachments_ir20
  Dataset description
  Summary table
  Detailed information
11 Data dictionary for ird_attachments_ir4s
  Dataset description
  Summary table
  Detailed information
12 Data dictionary for ird_old_systems_numbers
  Dataset description
  Summary table
  Detailed information
13 Glossary
1 Purpose of this data dictionary
IDI Data Dictionary: IR tax data (September 2015 edition) documents the content of the
datasets that Inland Revenue (IR) provides to Statistics New Zealand for use in the
Integrated Data Infrastructure (IDI). It consolidates several existing documents about the
IR tax data into a single, formal reference point for users.
This dictionary gives information on the variables contained in the IR tax datasets from
April 1999 – including technical information and descriptions.
Use this data dictionary if you are interested in understanding and accessing the IR tax
data in the IDI for your research.
2 About the tax data
Coverage
Reference period start: 1 April 1999
Reference period end: ongoing
Geographic coverage: all New Zealand
Methodology
Type of data: administrative data capture.
Data collector: Inland Revenue
Frequency of data collection: supplied monthly to the IDI
Privacy, security, or confidentiality issues
In addition to the confidentiality clauses pertaining to all data held by Statistics New
Zealand, the use of IR tax data is governed by the conditions specified in the
Memorandum of Understanding between Statistics NZ and Inland Revenue, and by the
conditions of the Tax Administration Act 1994.
The IR tax datasets that are accessible to researchers do not contain any name or
address information to identify an individual. All researchers who have access to the tax
data have had their research proposals assessed using Statistics NZ’s microdata access
protocols. Only approved researchers who have been granted access by Statistics NZ
and Inland Revenue may view the tax data.
Read Statistics NZ’s microdata access protocols.
All outputs produced from tax data must be aggregated, and counts must be suppressed if
the underlying unrounded count is fewer than 6.
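The suppression rule above can be sketched as follows. This is a minimal illustration only: the function name suppress_small_counts, the dictionary of counts, and the use of None to mark a suppressed cell are assumptions made for this example and are not part of any Statistics NZ tooling. Any other output requirements in the microdata access protocols are not covered.

```python
SUPPRESSION_THRESHOLD = 6  # counts fewer than 6 are suppressed


def suppress_small_counts(counts, threshold=SUPPRESSION_THRESHOLD):
    """Return a copy of counts with cells below the threshold replaced by None."""
    return {
        key: (value if value >= threshold else None)
        for key, value in counts.items()
    }


# Hypothetical aggregated counts keyed by income source code.
example_counts = {"W&S": 1250, "BEN": 5, "PPL": 6}
print(suppress_small_counts(example_counts))
```

A count of exactly 6 is kept, because the rule applies only to counts fewer than 6.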
List of datasets
ird_ems
ird_addresses
ird_customers
ird_client_names
ird_tax_registrations
ird_cross_reference
ird_rtns_keypoints_ir3
ird_attachments_ir20
ird_attachments_ir4s
ird_old_systems_numbers
3 Data dictionary for ird_ems
Dataset description
Contents of dataset: The employee-level data from the EMS return for period dates from
1 April 1999.
Conditions:
- Active records only, ie ir_ems_return_line_item_code = 'A'.
- Exclude records with gross earnings equal to 0.
Note:
Employers are able to file late returns and/or amend EMS returns relating to prior periods.
This means that in a given period, data may be updated with:
(a) New active data for the latest period. Records that were created and made inactive
within the same period are not included.
(b) New data relating to prior periods: data that has been newly submitted to
Inland Revenue but relates to prior periods.
(c) Revisions relating to prior periods: changes or revisions to data already
supplied.
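The two conditions listed above can be expressed as a simple record filter. The sketch below assumes each record is a Python dictionary keyed by IDI variable name; the function name meets_ems_conditions and the sample rows are hypothetical and only illustrate the rule.

```python
def meets_ems_conditions(record):
    """True if the record is active ('A') and has non-zero gross earnings."""
    return (
        record["ir_ems_return_line_item_code"] == "A"
        and record["ir_ems_gross_earnings_amt"] != 0
    )


# Hypothetical rows for illustration only.
sample_rows = [
    {"ir_ems_return_line_item_code": "A", "ir_ems_gross_earnings_amt": 3200.50},
    {"ir_ems_return_line_item_code": "I", "ir_ems_gross_earnings_amt": 3200.50},
    {"ir_ems_return_line_item_code": "A", "ir_ems_gross_earnings_amt": 0},
]
kept_rows = [row for row in sample_rows if meets_ems_conditions(row)]
print(len(kept_rows))
```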
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | Y | Y | N | |
snz_ird_uid | | N | N | | employee_ird_number
snz_employer_ird_uid | Y | Y | N | | employer_ird_number
ir_ems_employer_location_nbr | Y | Y | 4N | | employer_location_number
ir_ems_return_period_date | Y | Y | Datetime | | return_period_date
ir_ems_line_nbr | Y | Y | 6N | | line_number
ir_ems_snz_unique_nbr | Y | Y | N | |
ir_ems_version_nbr | Y | Y | 6N | | version_number
ir_ems_doc_lodge_prefix_nbr | Y | Y | 1N | | doc_lodge_nbr_prefix
ir_ems_doc_lodge_nbr | Y | Y | 9N | | doc_lodge_nbr
ir_ems_doc_lodge_suffix_nbr | Y | Y | 2N | | doc_lodge_nbr_suffix
ir_ems_gross_earnings_amt | | N | 13.2N | | gross_earnings_amount
ir_ems_gross_earnings_imp_code | | Y | 1A | | gross_earnings_imp_code
ir_ems_paye_deductions_amt | | N | 13.2N | | paye_deductions_amount
ir_ems_paye_imp_ind | | Y | 1A | | paye_imp_ind
ir_ems_earnings_not_liable_amt | | N | 13.2N | | earnings_not_liable_amount
ir_ems_earnings_not_liab_imp_ind | | Y | 1A | | earnings_not_liab_imp_ind
ir_ems_fstc_amt | | N | 13.2N | | ftsc_amount
ir_ems_sl_amt | | N | 13.2N | | sl_amount
ir_ems_withholding_type_code | | Y | 1A | | withholding_type_code
ir_ems_income_source_code | | Y | 3A | | income_source_code
ir_ems_employee_start_date | | N | Datetime | | date_employee_started
ir_ems_employee_end_date | | N | Datetime | | date_employee_finished
ir_ems_lump_sum_ind | | N | 1A | | lump_sum_indicator
ir_ems_tax_code | | Y | 6A | tax_codes | tax_code
ir_ems_return_line_item_code | | Y | 1A | | return_line_item_status_code
ir_ems_processed_date | | Y | Datetime | | date_processed
ir_ems_ird_timestamp_date | | Y | Datetime | | timestamp
ir_ems_enterprise_nbr | | N | 10A | |
ir_ems_pbn_nbr | | N | 10A | |
Detailed information
______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR
unique identifier (ird number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_employer_ird_uid
Definition: A local unique identifier (for an employer) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_ems_employer_location_nbr
Definition:
A location number is a sequence number that distinguishes between the locations
a customer may have that carry return filing obligations.
Format: Numeric, 9N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_return_period_date
Definition: Period covered by the return.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_line_nbr
Definition: A line item number is a sequence number used to identify the different line
items on a return attachment. It starts at 1 and is incremented by 1 for each line item.
Format: Numeric, 6N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_version_nbr
Definition: A version number is a means of distinguishing one version of a return
attachment line item from another. The version number is initialised at zero, then
incremented by 1 each time the record is changed.
Format: Numeric, 6N
Name of classification:
Notes:
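Because the version number increases each time a line item is changed, the most recent version of a line item is the one with the highest ir_ems_version_nbr. The sketch below illustrates selecting it. The helper latest_versions is hypothetical, and the choice of which fields identify a line item is left to the caller as an assumption; the example groups on ir_ems_line_nbr alone purely for brevity.

```python
def latest_versions(records, key_fields):
    """For each combination of key_fields, keep the record with the highest version."""
    latest = {}
    for record in records:
        key = tuple(record[field] for field in key_fields)
        current = latest.get(key)
        if current is None or record["ir_ems_version_nbr"] > current["ir_ems_version_nbr"]:
            latest[key] = record
    return list(latest.values())


# Hypothetical rows: line 1 was revised once, line 2 never.
rows = [
    {"ir_ems_line_nbr": 1, "ir_ems_version_nbr": 0, "ir_ems_gross_earnings_amt": 1000.00},
    {"ir_ems_line_nbr": 1, "ir_ems_version_nbr": 1, "ir_ems_gross_earnings_amt": 1200.00},
    {"ir_ems_line_nbr": 2, "ir_ems_version_nbr": 0, "ir_ems_gross_earnings_amt": 800.00},
]
current_rows = latest_versions(rows, ["ir_ems_line_nbr"])
print(current_rows)
```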
_______________________________________
Variable name: ir_ems_doc_lodge_prefix_nbr
Definition: The prefix of the document lodgement number under which this schedule (or
EMS) was filed. A prefix of 3 indicates a manual return, a prefix of 8 indicates an e-filed
return.
Format: Numeric, 1N
Name of classification:
Notes:
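The two prefix values described above can be decoded with a small lookup. This is an illustrative sketch; the names FILING_METHOD_BY_PREFIX and filing_method are hypothetical, and the "unknown" label for any other prefix is an assumption, since the definition documents only the values 3 and 8.

```python
# Prefix values taken from the definition above.
FILING_METHOD_BY_PREFIX = {3: "manual return", 8: "e-filed return"}


def filing_method(prefix_nbr):
    """Describe how a return was filed, based on ir_ems_doc_lodge_prefix_nbr."""
    return FILING_METHOD_BY_PREFIX.get(prefix_nbr, "unknown")


print(filing_method(3))
print(filing_method(8))
```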
_______________________________________
Variable name: ir_ems_doc_lodge_nbr
Definition: The document lodgement number (DLN) is a unique number assigned to
documents or returns lodged.
Format: Numeric, 9N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_doc_lodge_suffix_nbr
Definition: Suffix to document lodgement number.
Format: Numeric, 2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_gross_earnings_amt
Definition: Total earnings before tax deducted. The gross earnings paid to the employee.
The EMS may include more than one line item entry.
Format: Numeric, 13.2
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_gross_earnings_imp_code
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_paye_deductions_amt
Definition: Total income tax deductions.
Format: Numeric, 13.2
Name of classification:
Notes: This includes withholding payments
_______________________________________
Variable name: ir_ems_paye_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_earnings_not_liable_amt
Definition: Income not liable for ACC earner premium.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_earnings_not_liab_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_fstc_amt
Definition: Family Support Tax Credit – the amount of family support paid to each WINZ
beneficiary for the line item. This column only applies to NZISS customers. FSTC is on
DWI (WINZ) EMS schedules only as DWI are the only (beneficiary) ‘employer’ to fill in this
column, so it does not appear on the standard EMS form.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_sl_amt
Definition: Student loan repayments – student loan deduction amount for the line item.
The amount is always displayed as a negative number. The student loan amount is then
subtracted from the total student loan.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_withholding_type_code
Definition: P for PAYE deductions, W for withholding tax deductions.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_income_source_code
Definition: Code representing the source of income.
Format: Character, 3A
Name of classification: W&S – wages and salary, WHP – withholding payment, BEN –
benefits, STU – Student Allowance, PPL – Paid Parental Leave, PEN – Pensions
(superannuation), CLM – Claimants Compensation.
Notes:
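The income source codes listed under the classification above can be held in a lookup table when labelling output. The sketch below is illustrative: the names INCOME_SOURCE_LABELS and income_source_label are hypothetical, and stripping surrounding spaces from the 3-character code is an assumption about how the field may be stored.

```python
# Codes and labels as listed in the classification above.
INCOME_SOURCE_LABELS = {
    "W&S": "Wages and salary",
    "WHP": "Withholding payment",
    "BEN": "Benefits",
    "STU": "Student Allowance",
    "PPL": "Paid Parental Leave",
    "PEN": "Pensions (superannuation)",
    "CLM": "Claimants Compensation",
}


def income_source_label(code):
    """Return the label for an ir_ems_income_source_code value."""
    return INCOME_SOURCE_LABELS.get(code.strip(), "Unknown code")


print(income_source_label("PPL"))
```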
_______________________________________
Variable name: ir_ems_employee_start_date
Definition: Start date of the employee, as entered by the employer on the EMS.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_employee_end_date
Definition: End date of the employee, as entered by the employer on the EMS.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_lump_sum_ind
Definition: Flag to indicate a lump sum payment.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_tax_code
Definition: Tax code of employee. The tax code at which deductions have been made for
the employee for this line item number eg 'M' main source of income. Only one job can
have this code at any one time.
Format: Character, 6A
Name of classification: tax_codes
Notes:
_______________________________________
Variable name: ir_ems_return_line_item_code
Definition: Status code. A code is an abbreviation for a return line item status. Status
values are 'A' active or 'I' inactive.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_processed_date
Definition: Process date.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
Variable name: ir_ems_enterprise_nbr
Definition: A unique identifier generated by Statistics NZ for an enterprise. An enterprise
is an institutional unit and generally corresponds to legal entities operating in New
Zealand. It can be a company, partnership, trust, estate, incorporated society, producer
board, local or central government organisation, voluntary organisation, or self-employed
individual.
Format: 10A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_pbn_nbr
Definition: Permanent Business Number. 10-character code, consisting of 'PB' prefix,
followed by a unique 8-digit number. This is a Statistics NZ generated construct for a
geographically located business unit.
Format: 10A
Name of classification:
Notes:
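The format described above ('PB' followed by eight digits) can be checked with a regular expression. This is a minimal sketch; the function name is_valid_pbn is hypothetical, and the check confirms only the shape of the value, not that the number was actually issued.

```python
import re

# 'PB' prefix followed by exactly eight digits, 10 characters in total.
PBN_PATTERN = re.compile(r"PB[0-9]{8}")


def is_valid_pbn(value):
    """True if value has the documented Permanent Business Number format."""
    return PBN_PATTERN.fullmatch(value) is not None


print(is_valid_pbn("PB00012345"))
print(is_valid_pbn("PB12345"))
```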
4 Data dictionary for ird_addresses
Dataset description
Contents of dataset: This table contains geocoded address information for an individual.
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | | Y | N | |
snz_ird_uid | Y | Y | N | | ird_number
ir_apc_location_nbr | Y | Y | 4N | | location_number
ir_apc_address_type_code | Y | Y | 1A | address_types | address_type
ir_apc_snz_unique_nbr | | N | N | |
ir_apc_applied_date | | Y | Datetime | | date_applied
ir_apc_tax_type_code | Y | Y | 3A | tax_types | tax_type
ir_apc_main_address_ind | Y | Y | 1A | | main_address_indicator
ir_apc_post_code | | N | 6A | | post_code
ir_apc_address_status_code | | N | 1A | address_status | address_status
ir_apc_ceased_date | | N | Datetime | | date_ceased
ir_apc_ird_timestamp_date | | Y | Datetime | | timestamp
ir_apc_region_code | | N | 2A | |
ir_apc_ta_code | | N | 3A | |
ir_apc_meshblock_code | | N | 7A | |
ir_apc_meshblock_imputed_ind | | N | 1A | |
snz_idi_address_register_uid | | N | N | |
Detailed information
_________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: 7N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_location_nbr
Definition: Location number of the EMS filer (payroll system)
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_address_type_code
Definition: Type of address a client may have, eg 'L' - Physical location address, 'P' - Postal address, 'R' - Registered office, 'S' - Specific address, etc.
Format: Character, 1A
Name of classification: address_types
Notes:
_______________________________________
Variable name: ir_apc_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_applied_date
Definition: Date from which record became valid
Format: Datetime, dd/mm/yy
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_tax_type_code
Definition: Tax code.
Format: Character, 3A
Name of classification: tax_types
Notes:
_______________________________________
Variable name: ir_apc_main_address_ind
Definition: Y/N indicator that denotes whether the address is the client's main address. A
client may have more than one main address.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_post_code
Definition: A numeric code assigned by NZ Post to an area within New Zealand and
used for the delivery of mail.
Format: Character, 6A
Name of classification:
Notes: The post code field is populated for approximately 90 percent of records.
_______________________________________
Variable name: ir_apc_address_status_code
Definition: Current address status of customer, eg 'D' return to district office, 'I' invalid
address, 'O' overseas address, 'V' valid address etc
Format: Character, 1A
Name of classification: address_status
Notes:
_______________________________________
Variable name: ir_apc_ceased_date
Definition: Date from which record ceased to be valid
Format: Datetime, dd/mm/yy
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data
warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
Variable name: ir_apc_region_code
Definition:
Format: 2A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_ta_code
Definition:
Format: 3A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_meshblock_code
Definition: A seven digit mesh block number which is the lowest level of a customer's
geographic location.
Format: 7A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_meshblock_imputed_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: snz_idi_address_register_uid
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
5 Data dictionary for ird_customers
Dataset description
Contents of dataset: This table holds birth_month, birth_year, and entity_type.
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | Y | Y | N | |
snz_ird_uid | Y | Y | N | | ird_number
ir_cus_snz_unique_nbr | Y | Y | N | |
ir_cus_location_nbr | Y | Y | 4N | | location_number
ir_cus_entity_type_code | | Y | 1A | entity_types | entity_type
ir_cus_entity_class_code | | Y | 2A | entity_classes | entity_class
ir_cus_client_status_code | | Y | 1A | client_status | client_status
ir_cus_applied_date | | N | Datetime | | date_applied
ir_cus_ceased_date | | N | Datetime | | date_ceased
ir_cus_birth_year_nbr | | N | 4N | | date_of_birth
ir_cus_birth_month_nbr | | N | 2N | | date_of_birth
ir_cus_org_commencement_date | | N | Datetime | | org_commencement_date
ir_cus_loan_indicator_code | | N | 1A | | loan_indicator
ir_cus_resident_indicator_code | | N | 1A | | resident_indicator
ir_cus_sic_code | | N | 8A | sic_codes | sic_code
Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_location_nbr
Definition: Location number of taxpayer.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_entity_type_code
Definition: Type of entity eg C = company, M = Māori authority, P = partnership, I =
individual etc.
Format: Character, 1A
Name of classification: entity_types
Notes:
_______________________________________
Variable name: ir_cus_entity_class_code
Definition: Class of entity eg BS = Building Society, UT = unit trust, SW = salary or
wages etc.
Format: Character, 2A
Name of classification: entity_classes
Notes:
_______________________________________
Variable name: ir_cus_client_status_code
Definition: Status of the client eg C = ceased, B = bankrupt, A = active, L = liquidation, R
= receivership, M = amalgamated company, S = struck off, U = undischarged bankrupt.
Format: Character, 1A
Name of classification: client_status
Notes: e.g. active/bankrupt/ceased active
_______________________________________
Variable name: ir_cus_applied_date
Definition: Date from which the record became active.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_ceased_date
Definition: Date from which the record became inactive.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_birth_year_nbr
Definition:
Format: 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_birth_month_nbr
Definition:
Format: 2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_org_commencement_date
Definition: Commencement date for any entity other than an individual, ie company,
partnership, trust etc. Loan transfer date.
Format: Datetime, yyyymmdd
Name of classification:
Notes: May be set to 1/1/1970 if unknown.
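Because unknown commencement dates may be recorded as 1/1/1970, it can be useful to treat that value as missing rather than as a real date. The sketch below assumes the value arrives as a 'yyyymmdd' string, in line with the documented format; the function name parse_commencement_date and the use of None for unknown dates are assumptions for illustration.

```python
from datetime import date, datetime

# Placeholder value described in the note above.
UNKNOWN_COMMENCEMENT = date(1970, 1, 1)


def parse_commencement_date(text):
    """Parse a 'yyyymmdd' string, returning None for the 1/1/1970 placeholder."""
    parsed = datetime.strptime(text, "%Y%m%d").date()
    return None if parsed == UNKNOWN_COMMENCEMENT else parsed


print(parse_commencement_date("19700101"))
print(parse_commencement_date("20040315"))
```

One consequence of this choice is that an organisation that genuinely commenced on 1 January 1970 cannot be distinguished from one with an unknown date.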
_______________________________________
Variable name: ir_cus_loan_indicator_code
Definition: ‘Y’ indicates presence of student loan for individuals.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_resident_indicator_code
Definition: NZ resident / non-resident for tax purposes (R/N)
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_sic_code
Definition: Industry Code, eg 511010 = supermarkets, 523100 = furniture retailing.
Format: Character, 8A
Name of classification: sic_codes
Notes:
_______________________________________
6 Data dictionary for ird_client_names
Dataset description
Contents of dataset: This table holds sex and client status information.
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | Y | Y | N | |
snz_ird_uid | Y | Y | N | | ird_number
ir_cli_snz_unique_nbr | | Y | N | |
ir_cli_location_nbr | Y | N | 4N | | location_number
ir_cli_client_name_type_code | Y | N | 2A | client_name_type | client_name_type
ir_cli_sequence_nbr | Y | N | 3N | | sequence_number
ir_cli_applied_date | Y | N | Datetime | | date_applied
ir_cli_sex_snz_code | | N | 1A | |
ir_cli_sex_imp_code | | Y | 1A | |
ir_cli_ceased_date | | N | Datetime | | date_ceased
ir_cli_ird_timestamp_date | | N | Datetime | | timestamp
Detailed information
_________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_cli_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_location_nbr
Definition: Location number of the EMS filer (payroll system)
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_client_name_type_code
Definition: A code denoting the client name type, eg P = preferred name, S = secondary
name.
Format: Character, 2A
Name of classification: client_name_type
Notes:
_______________________________________
Variable name: ir_cli_sequence_nbr
Definition: A sequence number is the numeric code given to each of a client's names
within the combination of IRD number, location number, and client name type. It is not a
serial number, but duplicates the code in the client name type entity. There is a 1:1
relationship between client name number and client name type code: 10 = P, 20 = S,
30 = A, 40 = C, 50 = T.
Format: Numeric, 3N
Name of classification:
Notes:
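
The 1:1 relationship described in this definition can be expressed as a simple lookup. The following is a minimal Python sketch, not part of the IDI: the names SEQUENCE_TO_NAME_TYPE and is_consistent are hypothetical, and only the five number and code pairs listed in the definition are included.

```python
# Sequence number -> client name type code, as listed in the definition.
SEQUENCE_TO_NAME_TYPE = {10: "P", 20: "S", 30: "A", 40: "C", 50: "T"}


def is_consistent(sequence_nbr: int, name_type_code: str) -> bool:
    """Return True if the pair matches the documented 1:1 relationship.

    Pairs not listed in the definition are treated as inconsistent.
    Trailing padding in the 2A character field is ignored (an assumption).
    """
    return SEQUENCE_TO_NAME_TYPE.get(sequence_nbr) == name_type_code.strip()
```

A check like this may help flag rows where ir_cli_sequence_nbr and ir_cli_client_name_type_code disagree.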
_______________________________________
Variable name: ir_cli_applied_date
Definition: Date from which the record became valid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_sex_snz_code
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_sex_imp_code
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_ceased_date
Definition: Date from which the record became invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes: Reasons include a new name, death, etc.
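
Taken together, the applied and ceased dates describe a period during which a record is valid. The sketch below shows one way to test whether a record is valid on a given day. It assumes the dates are available as yyyymmdd strings, that the ceased date itself is excluded from the valid period, and that a missing ceased date means the record is still valid. The function names are hypothetical.

```python
from datetime import date, datetime
from typing import Optional


def parse_yyyymmdd(value: Optional[str]) -> Optional[date]:
    """Parse a yyyymmdd string; return None for missing values."""
    if value is None or value.strip() == "":
        return None
    return datetime.strptime(value.strip(), "%Y%m%d").date()


def is_valid_on(applied: Optional[str], ceased: Optional[str], on: date) -> bool:
    """Return True if the record is valid on the day `on`.

    Assumption: valid from the applied date up to, but not including,
    the ceased date; no ceased date means still valid.
    """
    start = parse_yyyymmdd(applied)
    end = parse_yyyymmdd(ceased)
    if start is None or on < start:
        return False
    return end is None or on < end
```

For example, is_valid_on("20100101", "20110101", date(2010, 12, 31)) is true under these assumptions, while the same call with date(2011, 1, 1) is false.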
_______________________________________
Variable name: ir_cli_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_________________________________________
7 Data dictionary for ird_tax_registrations
Dataset description
Contents of dataset: This table holds information about tax types.
Summary table
IDI variable name             | Primary key | Mandatory | Format   | Classification name  | Source variable name
snz_uid                       | Y           | Y         | N        |                      |
snz_ird_uid                   | Y           | Y         | N        |                      | ird_number
ir_treg_location_nbr          | Y           | Y         | 4N       |                      | location_number
ir_treg_tax_type_code         | Y           | Y         | 3A       | tax_types            | tax_type
ir_treg_applied_date          | Y           | Y         | Datetime |                      | date_applied
ir_treg_snz_unique_nbr        | Y           | Y         | N        |                      | ir_treg_snz_unique_nbr
ir_treg_treg_start_date       | Y           | Y         | Datetime |                      | treg_date_start
ir_treg_treg_end_date         |             | Y         | Datetime |                      | treg_date_end
ir_treg_filing_frequency_code |             | N         | 2A       | tax_filing_frequency | filing_frequency
ir_treg_treg_status_code      |             | N         | 1A       | tax_reg_status       | treg_status
ir_treg_ceased_date           |             | N         | Datetime |                      | date_ceased
ir_treg_posting_ind_code      |             | N         | 1A       | posting_indicators   | posting_ind
ir_treg_electronic_filing_ind |             | N         | 1A       |                      | electronic_filing_ind
ir_treg_corporate_filing_ind  |             | N         | 1A       |                      | corporate_filing_ind
ir_treg_has_agent_ind         |             | Y         | 1A       |                      | has_agent_ind
ir_treg_ird_timestamp_date    |             | Y         | Datetime |                      | timestamp
Detailed information
_________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_location_nbr
Definition: Location number of the EMS filer (payroll system)
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_tax_type_code
Definition: Tax type
Format: Character, 3A
Name of classification: tax_types
_______________________________________
Variable name: ir_treg_applied_date
Definition: Date from which the record became active.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_treg_start_date
Definition: Date the client first registered for a particular tax type
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_treg_end_date
Definition: Date the client deregistered for a particular tax type.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_filing_frequency_code
Definition: How often returns are filed for the tax type, eg D = twice monthly, Q = quarterly, I = irregularly.
Format: Character, 2A
Name of classification: tax_filing_frequency
Notes:
_______________________________________
Variable name: ir_treg_treg_status_code
Definition: Status of the tax registration: active, ceased, or unknown ('X').
Format: Character, 1A
Name of classification: tax_reg_status
Notes:
_______________________________________
Variable name: ir_treg_ceased_date
Definition: Date from which the record became invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes: Reasons include a new name, death, etc.
_______________________________________
Variable name: ir_treg_posting_ind_code
Definition: Distinguishes the type of address eg P = postal, Q = liquidator, A = agent etc.
Format: Character, 1A
Name of classification: posting_indicators
Notes:
_______________________________________
Variable name: ir_treg_electronic_filing_ind
Definition: 'Y' if an electronic filer, 'N' if paper filer
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_corporate_filing_ind
Definition: Indicates whether the customer is part of a corporate filing group.
Format: Character, 1A
Name of classification:
Notes: 'N' = not part of a group, 'P' = parent, 'S' = subsidiary
_______________________________________
Variable name: ir_treg_has_agent_ind
Definition: Indicates whether a tax agent acts on behalf of the customer.
Format: Character, 1A
Name of classification:
Notes: 'Y' = yes, 'N' = no.
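
The three single-character indicators ir_treg_electronic_filing_ind, ir_treg_corporate_filing_ind, and ir_treg_has_agent_ind can be decoded with small lookups. This is an illustrative Python sketch only: the function and dictionary names are hypothetical, the code values are those given in the definitions and notes for these variables, and any other value is returned as None rather than guessed.

```python
from typing import Optional

# Code values taken from the definitions and notes for each indicator.
YES_NO = {"Y": True, "N": False}
CORPORATE_FILING = {"N": "not part of a group", "P": "parent", "S": "subsidiary"}


def decode_registration_flags(
    electronic_filing_ind: Optional[str],
    corporate_filing_ind: Optional[str],
    has_agent_ind: Optional[str],
) -> dict:
    """Translate the three indicator codes into readable values.

    Unknown or missing codes map to None.
    """
    return {
        "electronic_filer": YES_NO.get(electronic_filing_ind),  # 'N' = paper filer
        "corporate_group_role": CORPORATE_FILING.get(corporate_filing_ind),
        "has_agent": YES_NO.get(has_agent_ind),
    }
```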
_______________________________________
Variable name: ir_treg_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_________________________________________
8 Data dictionary for ird_cross_reference
Dataset description
Contents of dataset: This table is maintained by IR and holds information about the set
of relationships between two IRD numbers. Most of the information in this table comes
from annual returns, so it can only be validated when those returns are processed.
Validation does not always occur.
Summary table
IDI variable name          | Primary key | Mandatory | Format   | Classification name   | Source variable name
snz_uid                    | Y           | Y         | N        |                       |
ir_xrf_from_snz_ird_uid    | Y           | Y         | N        |                       | ird_number_from
ir_xrf_to_snz_ird_uid      | Y           | Y         | N        |                       | ird_number_to
ir_xrf_applied_date        | Y           | Y         | Datetime |                       | date_applied
ir_xrf_ceased_date         |             | N         | Datetime |                       | date_ceased
ir_xrf_reference_type_code | Y           | Y         | 3A       | cross_reference_types | reference_type
ir_xrf_first_year_nbr      |             | N         | 4N       |                       | first_year
ir_xrf_latest_year_nbr     | Y           | Y         | 4N       |                       | latest_year
ir_xrf_ird_timestamp_date  |             | Y         | Datetime |                       | timestamp
Detailed information
_________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_xrf_from_snz_ird_uid
Definition:
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_xrf_to_snz_ird_uid
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_applied_date
Definition: Date from which record is valid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_ceased_date
Definition: Date from which record is invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_reference_type_code
Definition:
AAC = Amalgd/Amalging Co
ASS = Associated Person
BAN = Bankrupt
BEN = Beneficiary
DEC = Deceased
DEP = Dependent
DIR = Director
DUP = Duplicate IRD No
EOH = Exec Office Holder
GPR = GENERAL PARTNER
IGN = NOMINATED ICA CO
JVT = Joint Venture
LPR = LIMITED PARTNER
LQR = LIQUIDATOR
LTI = LOOK-THROUGH INT
LTO = LOOK THROUGH OWNER
NOM = Nominated Company
NOP = NOMINEE
NOR = NOMINATOR
NRC = NON RES CHLD SUPPT
PTR = Partner
SHR = Shareholder
SPO = Spouse/Defacto
SUB = Subsidiary Company
TEE = Trustee
TRA = TRANSITIONAL CLIEN
VAD = VOLUNTARY ADMINIST
Format: Character, 3A
Name of classification: cross_reference_types
Notes: eg shareholder/partner/bankrupt
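
For analysis it can be convenient to hold the reference type codes in a lookup. The Python sketch below is illustrative only: REFERENCE_TYPES and describe_reference_type are hypothetical names, the descriptions are copied as they appear in the definition (including truncated ones), and the pairing of the codes NRC to VAD with their descriptions is inferred from the order in which they are listed.

```python
from typing import Optional

# Code -> description, copied verbatim from the definition.
REFERENCE_TYPES = {
    "AAC": "Amalgd/Amalging Co", "ASS": "Associated Person",
    "BAN": "Bankrupt", "BEN": "Beneficiary", "DEC": "Deceased",
    "DEP": "Dependent", "DIR": "Director", "DUP": "Duplicate IRD No",
    "EOH": "Exec Office Holder", "GPR": "GENERAL PARTNER",
    "IGN": "NOMINATED ICA CO", "JVT": "Joint Venture",
    "LPR": "LIMITED PARTNER", "LQR": "LIQUIDATOR",
    "LTI": "LOOK-THROUGH INT", "LTO": "LOOK THROUGH OWNER",
    "NOM": "Nominated Company", "NOP": "NOMINEE", "NOR": "NOMINATOR",
    # Pairings below are inferred from listing order (an assumption).
    "NRC": "NON RES CHLD SUPPT", "PTR": "Partner", "SHR": "Shareholder",
    "SPO": "Spouse/Defacto", "SUB": "Subsidiary Company", "TEE": "Trustee",
    "TRA": "TRANSITIONAL CLIEN", "VAD": "VOLUNTARY ADMINIST",
}


def describe_reference_type(code: str) -> Optional[str]:
    """Return the description for a 3A code, or None if it is not listed."""
    return REFERENCE_TYPES.get(code.strip().upper())
```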
_______________________________________
Variable name: ir_xrf_first_year_nbr
Definition: First year of the cross reference relationship.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_latest_year_nbr
Definition: Latest year of the cross reference relationship.
Format: Numeric, 4N
Name of classification:
Notes:
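
The first and latest year variables can be used to test whether a cross reference relationship covers a given year. The sketch below is a hedged illustration, not a documented rule: it assumes the relationship applies to every year from the first year to the latest year inclusive, and that a missing first year (the summary table marks it as non-mandatory) places no lower bound. The function name is hypothetical.

```python
from typing import Optional


def relationship_covers_year(
    first_year: Optional[int], latest_year: int, year: int
) -> bool:
    """Return True if `year` falls within the relationship's year range.

    Assumptions: the range is inclusive at both ends, and a missing
    first year means the start of the relationship is unknown.
    """
    if year > latest_year:
        return False
    return first_year is None or year >= first_year
```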
_______________________________________
Variable name: ir_xrf_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
9 Data dictionary for ird_rtns_keypoints_ir3
Dataset description
Contents of dataset: This table contains information for the active items which have
non-zero partnership, self-employment, or shareholder salary income.
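
The inclusion rule in this description can be sketched as a row-level check. This Python example is illustrative only: it assumes each row is available as a dictionary keyed by IDI variable name, and it takes ir_ir3_tot_pship_income_amt, ir_ir3_net_profit_amt, and ir_ir3_tot_sholder_salary_amt as the partnership, self-employment, and shareholder salary amounts, following their definitions in this section. The function name is hypothetical.

```python
from decimal import Decimal

# Partnership, self-employment, and shareholder salary income fields.
INCOME_FIELDS = (
    "ir_ir3_tot_pship_income_amt",
    "ir_ir3_net_profit_amt",
    "ir_ir3_tot_sholder_salary_amt",
)


def has_nonzero_income(row: dict) -> bool:
    """Return True if any of the three income amounts is present and non-zero."""
    for field in INCOME_FIELDS:
        value = row.get(field)
        # Convert through str so floats and strings are handled the same way.
        if value is not None and Decimal(str(value)) != 0:
            return True
    return False
```

Negative amounts (for example a loss) count as non-zero in this sketch.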
Summary table
IDI variable name               | Primary key | Mandatory | Format   | Classification name | Source variable name
snz_uid                         | Y           | Y         | N        |                     |
snz_ird_uid                     | Y           | Y         | N        |                     | ird_number
ir_ir3_location_nbr             | Y           | Y         | 4N       |                     | location_number
ir_ir3_return_period_date       |             | Y         | Datetime |                     | return_period_date
ir_ir3_snz_unique_nbr           | Y           | Y         | N        |                     |
ir_ir3_tot_pship_income_amt     |             | N         | 13.2N    |                     | total_partnership_income_808
ir_ir3_tot_sholder_salary_amt   |             | N         | 13.2N    |                     | total_shareholder_salary_809
ir_ir3_net_profit_amt           |             | N         | 13.2N    |                     | net_profit_702
ir_ir3_income_imp_ind           |             | Y         | 1A       |                     |
ir_ir3_net_rents_826_amt        |             | N         | 13.2N    |                     | net_rents_826
ir_ir3_tot_wholding_paymnts_amt |             | N         | 13.2N    |                     | tot_w_holding_payments_100514
ir_ir3_tot_expenses_claimed_amt |             | N         | 13.2N    |                     | total_expenses_claimed_1512
ir_ir3_gross_earnings_407_amt   |             | N         | 13.2N    |                     | gross_earnings_407
ir_ir3_ird_timestamp_date       |             | N         | Datetime |                     | timestamp
Detailed information
_________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_location_nbr
Definition: Location number of the EMS filer (payroll system).
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_return_period_date
Definition: Period covered by return.
Format: Datetime, dd/mm/yy
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_tot_pship_income_amt
Definition: Partnership income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_tot_sholder_salary_amt
Definition: Shareholder salary income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_net_profit_amt
Definition: Self-employment income.
Format: Numeric, 13.2N
Name of classification:
Notes:
________________________________________
Variable name: ir_ir3_income_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
________________________________________
Variable name: ir_ir3_net_rents_826_amt
Definition: Net rental income
Format: Numeric, 13.2N
Name of classification:
Notes:
________________________________________
Variable name: ir_ir3_tot_wholding_paymnts_amt
Definition: Total gross earnings (with withholding tax deducted at source).
Format: Numeric, 13.2N
Name of classification:
Notes:
________________________________________
Variable name: ir_ir3_tot_expenses_claimed_amt
Definition: Total expenses claimed.
Format: Numeric, 13.2N
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir3_gross_earnings_407_amt
Definition: Gross earnings with PAYE deducted at source.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
10 Data dictionary for ird_attachments_ir20
Dataset description
Contents of dataset: This table contains information for active items which have non-zero partnership income.
Summary table
IDI variable name                | Primary key | Mandatory | Format   | Classification name | Source variable name
snz_uid                          | Y           | Y         | N        |                     |
snz_ird_uid                      | Y           | Y         | N        |                     | ird_number
snz_employer_ird_uid             |             | Y         | N        |                     | employer_ird_number
ir_ir20_location_nbr             | Y           | Y         | 4N       |                     | location_number
ir_ir20_return_period_date       | Y           | Y         | Datetime |                     | return_period_date
ir_ir20_snz_unique_nbr           |             | Y         | N        |                     |
ir_ir20_tot_share_of_inc_865_amt |             | N         | 13.2N    |                     | tot_share_of_inc_865_amt
ir_ir20_income_imp_ind           |             | Y         | 1A       |                     | income_imp_ind
ir_ir20_ird_timestamp_date       |             | N         | Datetime |                     | timestamp
Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_employer_ird_uid
Definition: A local unique identifier (for an employer) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_location_nbr
Definition: Location number of the payer.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_return_period_date
Definition: The return period.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir20_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir20_tot_share_of_inc_865_amt
Definition: Value of partnership income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_income_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir20_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
11 Data dictionary for ird_attachments_ir4s
Dataset description
Contents of dataset: This table holds information about the active items which have
non-zero shareholder income.
Summary table
IDI variable name              | Primary key | Mandatory | Format   | Classification name | Source variable name
snz_uid                        | Y           | Y         | N        |                     |
snz_ird_uid                    |             | Y         | N        |                     | ird_number
snz_employer_ird_uid           | Y           | Y         | N        |                     | employer_ird_number
ir_ir4_location_nbr            | Y           | Y         | 4N       |                     | location_number
ir_ir4_return_period_date      | Y           | Y         | Datetime |                     | return_period_date
ir_ir4_snz_unique_nbr          | Y           | Y         | N        |                     |
ir_ir4_tot_sholder_sal_809_amt |             | N         | 13.2N    |                     | total_shareholder_salary_809
ir_ir4_income_imp_ind          |             | Y         | 1A       |                     | income_imp_ind
ir_ir4_ird_timestamp_date      |             | Y         | Datetime |                     | timestamp
Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_employer_ird_uid
Definition: A local unique identifier (for an employer) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates
that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_location_nbr
Definition: Location number of the payer.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_return_period_date
Definition: The return period.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir4_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir4_tot_sholder_sal_809_amt
Definition: Value of shareholder salary.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_income_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir4_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
12 Data dictionary for ird_old_systems_numbers
Dataset description
Contents of dataset: This table contains the mapping of IRD numbers from the old system to
the new system.
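
A mapping of this kind is typically used as a lookup from the old identifier to the current one. The Python sketch below is illustrative only: it assumes rows are available as dictionaries keyed by IDI variable name, treats ir_osn_old_snz_ird_uid as the old-system identifier and snz_ird_uid as the current one, and ignores the applied and ceased dates. The function names, and the choice to return an unmapped identifier unchanged, are assumptions.

```python
from typing import Dict, Iterable


def build_old_to_new(rows: Iterable[dict]) -> Dict[int, int]:
    """Build a lookup from old-system identifier to current identifier.

    If an old identifier appears more than once, the last row wins.
    """
    mapping: Dict[int, int] = {}
    for row in rows:
        mapping[row["ir_osn_old_snz_ird_uid"]] = row["snz_ird_uid"]
    return mapping


def resolve_uid(mapping: Dict[int, int], uid: int) -> int:
    """Return the mapped identifier, or `uid` itself if it is not mapped."""
    return mapping.get(uid, uid)
```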
Summary table
IDI variable name         | Primary key | Mandatory | Format   | Classification name | Source variable name
snz_uid                   | Y           | Y         | N        |                     |
ir_osn_old_snz_ird_uid    | Y           | Y         | N        |                     | old_system_number
snz_ird_uid               |             | Y         | N        |                     | ird_number
ir_osn_location_nbr       |             | Y         | N        |                     | location_number
ir_osn_applied_date       | Y           | Y         | Datetime |                     | date_applied
ir_osn_ceased_date        |             | N         | Datetime |                     | date_ceased
ir_osn_ird_timestamp_date |             | Y         | Datetime |                     | timestamp
Detailed information
_________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each
distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_osn_old_snz_ird_uid
Definition:
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR
unique identifier (IRD number). This identifier will remain the same for an identity across
refreshes. Where we receive more information during a subsequent refresh that indicates that
two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_osn_location_nbr
Definition: Location number of the payer.
Format: Numeric,
Name of classification:
Notes:
_________________________________________
Variable name: ir_osn_applied_date
Definition: Date from which record is valid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_osn_ceased_date
Definition: Date from which record is invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_________________________________________________
Variable name: ir_osn_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_________________________________________________
13 Glossary
Term            | Definition
IDI name        | (Stats NZ) The variable names in the IDI SQL database.
Mandatory field | (IR) Indicates a field which cannot be “null”.
Primary key     | (Stats NZ) An identifier for a unique database item (may consist of a single item or multiple items in combination).
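
Because a primary key may combine several items, uniqueness has to be checked on the combination rather than on each item separately. The Python sketch below illustrates the idea. It assumes rows are available as dictionaries keyed by variable name, and the function name duplicate_keys is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def duplicate_keys(rows: Iterable[dict], key_columns: Sequence[str]) -> List[Tuple]:
    """Return key combinations that occur in more than one row.

    An empty result means the chosen columns identify each row uniquely.
    """
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]
```

For instance, two rows may share the same snz_uid yet still be unique once a date column is included in the key.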