What is metadata? How do I understand it? How can I contribute?
Metadata is the structured data that describes the data associated with a collection. Imagine a spreadsheet with column headings that signify the kinds of data that go into the cells of that column. One heading might read, “date of birth,” another, “country of origin.” These columns create the fields that structure the data that gets collected from each document, collectively forming our metadata schema. On the digital object pages, however, this data does not appear in spreadsheet form, but as a list right beneath the image viewer. The full collection of metadata is available for download and reuse here.
NARA provides a few metadata fields with their digitized collection, namely, date of birth, date of entry, port of entry, and country of birth, but little else. So, unlike NARA’s minimal metadata that applies to the collective A-File, Keeping Records and the Golden Gate assigns metadata on the document level. That means we assigned metadata to every document within each A-File, making Keeping Records and the Golden Gate’s metadata more expansive, precise, and searchable. However, some documents do not contain the same data as others, and since our metadata is on the document level, not every object will have data for every field. Any field left blank does not appear on the object viewer page.
From its inception, this project has been collaborative in every aspect. We invite anyone to join the team by contributing to our metadata. You can use this form to add metadata for any document that needs it, or you can use the same form to suggest changes to the existing metadata. Use the guide below to learn about our schema, and this article to learn more about the reasons we made the choices we did in structuring the data.
Guide to Each Field
pid | label | type | description | date_created | temporal_coverage | creator | contributor | recipient | country_of_origin | source_origin | a_number | family_name | given_name | additional_name | alias | nationality | race | complexion | color | ethnicity | birth_place | birth_date | death_date | date_of_entry | port_of_entry | date_of_naturalization | residence | sex | family_members | occupation | status_enumeration |
pid
- These match the names of the images that are in the image directories but without .jpg
- The pid also corresponds with the Alien Number (A-Number) for the corresponding Alien File (A-File)
label
- A succinct summary of key metadata from the document. The general rule is as follows:
- If it says “official form” or has some kind of modifier before the word “form” just enter it as “form.”
- If it is a correspondence, then include:
- letter from (creator) to (recipient)
- For any document type that is entirely unidentifiable (like a little scrap with only a tiny amount of information, or when the scan is bad or the document appears in unreadable condition), use “[?]” as a placeholder.
type
This is one of the toughest fields as many document types could be classified under multiple categories. We primarily distinguish between documents that have their origin within a government agency and those created by non-governmental actors. As a general rule, if the document is clearly a form, and especially if it has an identifiable form number, then that takes precedence over any other classification.
- Government Forms and Certificates
- A document that has preset fields that get filled out, especially if it has a clearly identifiable form number.
- Many documents will be a form and could potentially be classified as another type of document (e.g. report, affidavit, etc.). In our hierarchy, however, anything that is a form gets classified as a form regardless of the other categories it might fall under.
- Some forms have check-box style lists of recipients. Only list the checked recipients.
- If the form requires or has a photo attached, note this in “description.”
- Non-Government Forms and Certificates
- A document that has preset fields that get filled out
- Usually serves as proof of something (e.g. residence, membership in a church or organization, etc.).
- If the form requires or has a photo attached, note this in “description.”
- Government Correspondence, Reports, and Memos
- A document that is not a form, but whose creator is typically a government employee or official
- Usually meant for more than one recipient
- Anything clearly labeled memo or memorandum
- Includes investigative reports
- Correspondences between government officials or government departments, bureaus, organizations, etc.
- Correspondence originating with government employees or officials, but not necessarily sent to a government recipient, in which case the document will typically have government letterhead.
- Creator is the person or entity writing and sending the document. Recipient is the person or entity the document is being sent to. If the individual recipient is not identifiable, use the government agency, as you would with source origin.
- Non-Government Correspondence, Reports, and Memos
- Letters, memos, etc. not originating with a government employee or official
- Usually intended for a smaller audience or single recipient
- Letters written by civilians to government officials, agencies, etc.
- Creator is the sender. Recipient is the person or entity the letter is being sent to. If the individual recipient is not identifiable, use the government agency, as you would with source origin.
- Photograph
- Use for stand-alone photographs.
- Photos are often attached to forms, in which case use Government Forms and Certificates.
- If a form requires or has a photo attached, note this in “description.”
description
- Relevant information, clarifications, or descriptions that do not fit under any other field.
- Include any notable irregularities or aspects that make it difficult to understand/read etc.
date_created
- Year, Month, Day (example: 1988-06-05 for June 5, 1988) as per the ISO 8601 format.
- Each date value has a fixed number of digits which must be padded with leading zeros.
- Pertains to the creation of the document, and not the details of the life of the immigrant.
- If the document is a replicatable form, this is not the date the template for the form was created or updated.
- If it’s a form, it’s usually the date the person filled out the form or the earliest date in a temporal coverage.
- Format Guide:
Year Known | Month Known | Day Known | ISO 8601 | Example |
Yes | Yes | Yes | YYYY-MM-DD | 2020-02-20 |
Yes | Yes | No | YYYY-MM-- | 2020-02 |
Yes | No | No | YYYY---- | 2020----- |
Yes | No | Yes | YYYY---DD | 2020---20 |
No | Yes | Yes | --MM-DD | --02--20 |
No | No | Yes | ----DD | ----20 |
No | Yes | No | --MM-- | --02-- |
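The padding and placeholder rules in the format guide can be sketched in Python. This is only an illustration, not a tool the project provides: the function name `format_date` and the `TEMPLATES` table are hypothetical, and the sketch assumes the patterns in the "ISO 8601" column of the guide are authoritative wherever an entry in the Example column is punctuated differently.

```python
# Templates copied from the "ISO 8601" column of the format guide,
# keyed by (year known, month known, day known).
TEMPLATES = {
    (True, True, True): "YYYY-MM-DD",
    (True, True, False): "YYYY-MM--",
    (True, False, False): "YYYY----",
    (True, False, True): "YYYY---DD",
    (False, True, True): "--MM-DD",
    (False, False, True): "----DD",
    (False, True, False): "--MM--",
}


def format_date(year=None, month=None, day=None):
    """Build a date string, padding known parts with leading zeros."""
    key = (year is not None, month is not None, day is not None)
    if key not in TEMPLATES:
        raise ValueError("at least one of year, month, or day must be known")
    text = TEMPLATES[key]
    # Fill in only the parts that are known; unknown parts stay as dashes.
    if year is not None:
        text = text.replace("YYYY", f"{year:04d}")
    if month is not None:
        text = text.replace("MM", f"{month:02d}")
    if day is not None:
        text = text.replace("DD", f"{day:02d}")
    return text


print(format_date(1988, 6, 5))   # fully known date
print(format_date(2015, 11))     # day unknown
```

Passing integers rather than strings lets the function handle the zero padding, so June 5, 1988 cannot accidentally be entered as 1988-6-5.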
temporal_coverage
- Year, Month, Day (example: 1988-06-05 for June 5, 1988) as per the ISO 8601 format.
- Most government documents have multiple dates. This field attempts to track all the stages of a document’s processing.
- A filled out form likely has a date when it was filled out, and other dates where various officials stamp or watermark the document. Use temporal coverage to track them all.
- For multiple dates, put them in order and separate them out with a forward slash.
example: 1942-05-14/1942-06-01/1942-06-03
- Open-ended date ranges can be written with “..” in place of the end date. For example, “2015-11--/..” indicates a range beginning in November 2015 with no specified final date (there will be few, if any, documents in this collection that this applies to).
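As a rough illustration of the ordering and separator rules above, the following Python sketch assembles a temporal_coverage value. The function name `temporal_coverage` and its `open_ended` parameter are hypothetical. Sorting the strings is only reliable for fully specified YYYY-MM-DD dates; values with unknown parts would need to be ordered by hand.

```python
def temporal_coverage(dates, open_ended=False):
    """Join document dates into a single temporal_coverage value."""
    # For complete YYYY-MM-DD strings, alphabetical order is also
    # chronological order, so a plain sort is enough.
    ordered = sorted(dates)
    if open_ended:
        # ".." stands in for a missing end date.
        ordered.append("..")
    return "/".join(ordered)


print(temporal_coverage(["1942-06-01", "1942-06-03", "1942-05-14"]))
# prints 1942-05-14/1942-06-01/1942-06-03
```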
creator
- In most cases, the person most responsible for the work’s creation.
- The creator or author of the document.
- If it is a government form, then the creator is whichever government official signed off on the form or was most responsible for getting someone to fill it out. While it may seem strange that this is not the person filling out the form, we hold that the government official who can be identified as the principal instigator for the form getting filled is the creator. (Click here for more on this logic.)
- If the name of the creator is indiscernible, make your best attempt and end with “[?]”. If only one of either family name or given name is illegible, replace the illegible portion with “[?]”.
- Does NOT need to be written as Family name, Given Name, Additional Name (last, first middle). Include as it appears on the document.
- Titles aren’t necessary, but if all the document lists is a title, then this is used in place of the creator’s name.
contributor
- This is anyone in addition to the creator.
- Example 1: John Lee writes a letter, and Leonard Schmieglestein notarizes the document. Lee is the creator, and Schmieglestein is the contributor.
- Example 2: Millard Wilson signs a government form filled out by an immigrant, Humphrey Bjorgbork. Wilson is the creator, Bjorgbork is the contributor.
- Does NOT need to be written as Family name, Given Name, Additional Name (last, first middle). Include as it appears on the document.
- Titles aren’t necessary, but if all the document lists is a title, then this is used in place of the contributor’s name.
recipient
- Required when type is “correspondence.”
- If a title (e.g. “District Director”) is all you have for the recipient, then use that for the recipient.
- If the recipient is something like “To whomever it may concern,” then leave the recipient field empty.
- When there is no individual named but only an agency or organization, then list the organization (in the same fashion as you would an organization under source origin).
Now let’s say you have “to the district director” AND you know the organization (not by inferring it, but because it is on the document). Then list: District Director | United States Department of Justice | Immigration and Naturalization Service
country_of_origin
- The country of the principal offices of the document or origin of the creator. NOT the country of the immigrant.
- Write “United States” instead of “United States of America” or USA if country of origin is the United States of America.
source_origin
- The organization on whose behalf the creator was working.
- If it says “United States Department of Justice, Immigration and Naturalization Service,” then separate the department and agency with a vertical pipe.
- Example: United States Department of Justice | Immigration and Naturalization Service
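Several fields in this guide hold more than one value separated by a vertical pipe. A minimal Python sketch of that convention follows. The helper names `join_values` and `split_values` are hypothetical, and the single space on each side of the pipe is an assumption based on the example above.

```python
SEPARATOR = " | "  # assumed spacing, taken from the example in this guide


def join_values(values):
    """Combine several values into one pipe-separated field."""
    # Drop empty entries and stray whitespace before joining.
    cleaned = [value.strip() for value in values if value and value.strip()]
    return SEPARATOR.join(cleaned)


def split_values(field):
    """Split a pipe-separated field back into its individual values."""
    return [part.strip() for part in field.split("|") if part.strip()]


source_origin = join_values([
    "United States Department of Justice",
    "Immigration and Naturalization Service",
])
print(source_origin)
```

Splitting on the bare pipe character and trimming each part means that values typed with uneven spacing around the pipe are still read back correctly.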
a_number
- Use capital “A” in A-Number.
- List only if it appears on the document.
family_name
- This is only for the person who the government created the A-File for.
- Otherwise known as last name or surname.
- Be careful about language and non-English customs (e.g. Spanish speakers who might have two last names, matronymics, and patronymics).
- Record as it appears on the document. If the name is obviously misspelled or if there are discrepancies within the document or from one document to the next, then note that in the description field.
given_name
- This is only for the person who the government created the A-File for.
- Otherwise known as first name.
- Record as it appears on the document. If the name is obviously misspelled or if there are discrepancies within the document or from one document to the next, then note that in the description field.
additional_name
- This is only for the person who the government created the A-File for.
- Use for middle name.
- Use for maiden name or née.
- Record as it appears on the document. If the name is obviously misspelled or if there are discrepancies within the document or from one document to the next, then note that in the description field.
alias
- This is only for the person who the government created the A-File for.
- Use for nicknames or any other names that are used to identify them on the document, but aren’t their legal names.
nationality
- Nationality of the person to whom the A-File belongs.
- Correct for unusual syntax (e.g. use “Japanese” instead of “Japan” even if the document says “Japan”).
- If the document lists current and former nationality, use current but note former in the description.
race
- List only if the document enumerates it. Not to be confused with complexion, color, or ethnicity.
- Click here for the intellectual rationale behind this metadata field.
complexion
- List only if the document enumerates it. Not to be confused with race, color, or ethnicity.
- Most likely color words or descriptions of skin tone like “light” and “dark.”
- Click here for the intellectual rationale behind this metadata field.
color
- List only if the document enumerates it. Not to be confused with race, complexion, or ethnicity.
- Click here for the intellectual rationale behind this metadata field.
ethnicity
- Use this field where known, even though it likely will not be listed on any A-File documents from this time period.
- Click here for the intellectual rationale behind this metadata field.
birth_place
- Do not surmise this category by nationality, citizenship, or names etc.
birth_date
- Year, Month, Day (example: 1988-06-05 for June 5, 1988) as per the ISO 8601 format.
death_date
- Year, Month, Day (example: 1988-06-05 for June 5, 1988) as per the ISO 8601 format.
date_of_entry
- ISO 8601 format.
- While the subject of the A-File may have entered the United States or crossed the border multiple times over the course of their life, most likely a document will only record the date of entry relevant to that document.
- Where multiple dates of entry do occur on a single document, use a vertical pipe to separate them.
port_of_entry
- City, County, State, Country.
- While the subject of the A-File may have entered the United States or crossed the border multiple times over the course of their life, most likely a document will only record the port of entry relevant to that document.
- Where multiple ports of entry do occur on a single document, use a vertical pipe to separate them.
date_of_naturalization
residence
- City, County, State, Country.
- Use vertical pipe if multiples are mentioned.
sex
- Given the era from which our documents are drawn, there is a near certainty that the term “gender” will not appear on these documents.
- Click here for the intellectual rationale behind this metadata field.
family_members
- Enter as it appears on the document or in the following order: Given name, additional name, family name (relationship)
- Example: Richard Bryan Zehngut-Willits (brother) | Monica Rachelle Willits-Rykse (sister)
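The name order and relationship format for family members can be sketched in Python as follows. The function names and the dictionary keys (`given`, `additional`, `family`, `relationship`) are hypothetical, and separating multiple family members with a vertical pipe is an assumption carried over from the other multi-value fields in this guide.

```python
def format_family_member(given, family, relationship, additional=""):
    """Return 'Given Additional Family (relationship)' for one person."""
    # Skip any name part that is missing, so no double spaces appear.
    name = " ".join(part for part in (given, additional, family) if part)
    return f"{name} ({relationship})"


def format_family_members(members):
    """Join several family members into one pipe-separated field."""
    return " | ".join(format_family_member(**member) for member in members)


members = [
    {"given": "Richard", "additional": "Bryan",
     "family": "Zehngut-Willits", "relationship": "brother"},
    {"given": "Monica", "additional": "Rachelle",
     "family": "Willits-Rykse", "relationship": "sister"},
]
print(format_family_members(members))
```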
status_enumeration
- Enumerations dealing with status types, especially immigration status.
- Examples: Enemy alien / alien enemy, Asylee, Refugee, etc.
- Begin the status with a capital letter, but it does not need to follow headline-style capitalization.
- Example: Enemy alien parolee NOT Enemy Alien Parolee
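The capitalization rule for status values can be sketched in Python. The function name `normalize_status` is hypothetical, and the sketch assumes the status contains no proper nouns, since lowercasing everything after the first letter would also lowercase a country or agency name.

```python
def normalize_status(status):
    """Capitalize only the first letter of a status value."""
    # Collapse repeated spaces, then lowercase everything after
    # the first character.
    text = " ".join(status.split())
    return text[:1].upper() + text[1:].lower()


print(normalize_status("Enemy Alien Parolee"))  # prints Enemy alien parolee
```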