This shows you the differences between two versions of the page.
public:data_collection [2020/01/30 14:58] admin Edited for SD, example of sources tbd
public:data_collection [2020/02/06 13:16] (current) admin removed irrelevant section
Line 5:
Collection or creation of data is the first step in the life cycle of data management. In many cases, data already exists but must be found and then cleaned (checked for accuracy and consistency), organized for a specific purpose, saved, shared, and updated. Data collection should be conducted with an awareness of what data is of interest, to whom, and how they will use it; otherwise, irrelevant or incomplete data will be collected.
- There is a hierarchy of sources for data collection:
-
- * **Primary source**: the entity that is directly responsible for creating the original version of the information (not translated or otherwise modified).
-
- ''For example, if the government is understood as the top authority for issuing a contract to a company, then a signed and stamped government document describing the contract is the primary (and preferred) source. If available, primary sources in their original language should always be the preferred type for data collection.''
-
- * **Secondary source**: an entity that may not be a main actor or have complete authority but is still involved in documentation.
- * ''For example, a newspaper article about a contract. A company-issued press release about the contract could also be considered a secondary source, depending on the focus of interest (some might define the company as a main actor, making it a primary source). Consider the reputability of a secondary source before collecting data from it.''
-
- During data collection, also identify who is responsible for each part and have a system for tracking who worked on what, so that any later questions can be appropriately directed. Checking data can be more time-consuming than setting high, consistently applied standards for collecting it in the first place. If there is conflicting information from the same source, or missing information, make a clear note about the issue so that further review can be done later.