This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
public:open_data [2020/01/29 16:36] admin created |
public:open_data [2021/12/17 00:08] (current) lukebinns [Disclaimer] |
||
---|---|---|---|
Line 9: | Line 9: | ||
* **Availability and access**: the data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form | * **Availability and access**: the data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form | ||
* **Reuse and redistribution**: the data must be provided under terms that permit reuse and redistribution including the intermixing with other datasets. | * **Reuse and redistribution**: the data must be provided under terms that permit reuse and redistribution including the intermixing with other datasets. | ||
- | * **Universal participation**: everyone must be able to use, reuse and redistribute - there should be no discrimination against fields of endeavour or against persons or groups. For example, ‘non-commercial’ restrictions that would prevent ‘commercial’ use, or restrictions of use for certain purposes (e.g. only in education), are not allowed. | + | * **Universal participation**: everyone must be able to use, reuse and redistribute - there should be no discrimination against fields of endeavour or against persons or groups. For example, ‘non-commercial’ restrictions that would prevent ‘commercial’ use, or restrictions of use for certain purposes (e.g. only in education), are not fully open. |
- | Why is it so important to be clear about the definition of "open"? The answer: interoperability, which is the ability of diverse systems and organizations to work together (inter-operate). In this case, it is the ability to usefully combine datasets from different countries in the lower Mekong region. | + | Why is it so important to be clear about the definition of "open"? The answer: interoperability, which is the ability of diverse systems and organizations to work together (inter-operate). Interoperability allows for different components to work together. This ability to make components and to plug them together is essential to building large, complex systems. Without interoperability this becomes nearly impossible. |
- | + | ||
- | Interoperability allows for different components to work together. This ability to make components and to plug them together is essential to building large, complex systems. Without interoperability this becomes nearly impossible. | + | |
The core of a “commons” of data (or code) is that one piece of “open” material can be freely intermixed with other “open” material. This interoperability is key to achieving the main practical benefits of “openness”: the enhanced ability to combine different datasets together and thereby to develop more and better products and services. This ability to combine separate pieces from different sources into larger, more sophisticated systems is the real value of the openness standard. | The core of a “commons” of data (or code) is that one piece of “open” material can be freely intermixed with other “open” material. This interoperability is key to achieving the main practical benefits of “openness”: the enhanced ability to combine different datasets together and thereby to develop more and better products and services. This ability to combine separate pieces from different sources into larger, more sophisticated systems is the real value of the openness standard. | ||
+ | ===== Disclaimer ===== | ||
+ | |||
+ | All data linked to this Open Data portal is published “as is”. The Information is licensed 'as is' and the Information Provider and/or Licensor excludes all representations, warranties, obligations and liabilities in relation to the Information to the maximum extent permitted by law. The Information Provider and/or Licensor are not liable for any errors or omissions in the Information and shall not be liable for any loss, injury or damage of any kind caused by its use. The Information Provider does not guarantee the continued supply of the Information. | ||
===== Why open data? ===== | ===== Why open data? ===== | ||
Line 27: | Line 28: | ||
* **Participation and engagement – participatory governance or for business and organizations engaging with your users and audience**: Much of the time citizens are only able to engage with their own governance sporadically — maybe just at an election every 4 or 5 years. By opening up data, citizens are enabled to be much more directly informed and involved in decision-making. This is more than transparency: it’s about making a full “read/write” society, not just about knowing what is happening in the process of governance but being able to contribute to it. | * **Participation and engagement – participatory governance or for business and organizations engaging with your users and audience**: Much of the time citizens are only able to engage with their own governance sporadically — maybe just at an election every 4 or 5 years. By opening up data, citizens are enabled to be much more directly informed and involved in decision-making. This is more than transparency: it’s about making a full “read/write” society, not just about knowing what is happening in the process of governance but being able to contribute to it. | ||
- | ===== Which file formats are better for open data? ===== | ||
- | When exploring the Open Development Mekong data catalog, you are likely to find data provided in a variety of formats. The formats were chosen to best match the information the data describes: images as PNGs and TIFFs, text documents as Word Documents (DOC), Text (TXT) files, and sometimes PDFs, spreadsheets as CSVs, and geospatial information as GeoJSON, TopoJSON, KML, and ESRI Shapefile. These are all considered open formats and can be used freely in many applications and on most computer operating systems. | ||
- | The information below will help you better understand the file formats used in the ODM catalog and how you can begin investigating their contents. | + | ===== Which file formats are better for open data? ===== |
- | + | ||
- | === CSV (Comma-Separated Value) === | + | |
- | * What: A tabular (spreadsheet) data format, where the column values are separated by commas. CSV files are both human and machine readable. In the wild, you may see many other "delimiters" used, including tabs. | + | |
- | * How: Many applications, including Microsoft Excel, OpenOffice, and Google Docs and by text editors like Sublime Text, TextWrangler, Apple TextEdit, and Microsoft Notepad. When the CSV includes geographic coordinates, you may also open them in desktop mapping applications, such as TileMill and QGIS, and with web-mapping tools, like GeoJSON.io and CartoDB. | + | |
- | === DOC (Microsoft Word document) === | + | When exploring the SmartDublin data catalog, you will find data provided in a variety of formats. The formats were chosen to best match the information the data describes: spreadsheets as CSVs, geospatial information as GeoJSON or KML, and data provided through APIs (application programming interfaces). These are all considered open formats and can be used freely in many applications and on most computer operating systems. A star rating indicates the openness of the data; see [[https://5stardata.info/en/| 5-star Open Data]]: |
- | * What: A widespread document format developed by Microsoft for word processing. | + | | ** 5 Star Open Data ** | |
- | * How: In addition to Microsoft Word, you can open DOC files in OpenOffice, Apple Pages, LibreOffice, and other word processors. | + | |★ | make your stuff available on the Web (whatever format) under an open license | |
+ | |★★ | make it available as structured data (e.g., Excel instead of image scan of a table) | | ||
+ | |★★★ | use non-proprietary formats (e.g., CSV instead of Excel) | | ||
+ | |★★★★ | use URIs to denote things, so that people can point at your stuff | | ||
+ | |★★★★★| link your data to other data to provide context | | ||
+ | //To be clear, it is preferable to publish your data with as many stars as possible, e.g. as CSV rather than Excel.// | ||
+ | ====Costs and Benefits==== | ||
- | === JPEG (Joint Picture Experts Group) === | + | The information below will help you better understand the file formats used in the SmartDublin catalog and how you can begin investigating their contents. |
- | * What: A common image format that usually produces smaller file sizes but at a loss in image quality/resolution. | + | |
- | * How: Most operating systems have built-in image-viewing applications to automatically open JPEGs, such as Microsoft Paint. Adobe Photoshop is more advanced software for viewing and editing. | + | |
- | === JSON (JavaScript Object Notation) === | + | === CSV (Comma-Separated Value) ★★★ === |
- | * What: JSON is an easily human and machine readable open standard format, which transmits data objects consisting of attribute-value pairs. GeoJSON is an extension of JSON that allows for the encoding of simple geographical features (points, lines and polygons) along with non-spatial attributes. TopoJSON itself extends GeoJSON by "stitching" together shared geometries (e.g. borders). This reduces file size and also facilitates certain visualizations. | + | * What is it: A tabular (spreadsheet) data format, where the column values are separated by commas. CSV files are both human and machine readable. You may see other "delimiters" used, including tabs. |
- | * How: GeoJSON files can be opened by desktop mapping applications and web-mapping tools, by R (with the right extensions), and by text editors. The command-line utility ogr2ogr (part of GDAL) supports conversion from many geospatial file formats into GeoJSON. | + | * Used by: Many applications, including Microsoft Excel, OpenOffice, and Google Docs, and text editors like Sublime Text, TextWrangler, Apple TextEdit, and Microsoft Notepad. When the CSV includes geographic coordinates, you may also open it in desktop mapping applications, such as QGIS, and with web-mapping tools, like GeoJSON.io and CartoDB. |
- | === KML (Keyhole Markup Language) === | + | === JSON (JavaScript Object Notation) ★★★ === |
- | * What: KML is an XML notation format used to express geographic information (longitude, latitude, altitude) on two- and three-dimensional maps. These files are easily readable by humans and machines. | + | * What is it: JSON is an easily human and machine readable open standard format, which transmits data objects consisting of attribute-value pairs. GeoJSON is an extension of JSON that allows for the encoding of simple geographical features (points, lines and polygons) along with non-spatial attributes. TopoJSON itself extends GeoJSON by "stitching" together shared geometries (e.g. borders). This reduces file size and also facilitates certain visualizations. |
- | * How: These files were originally developed for use with Google Earth. They can also be opened in a variety of other desktop GIS applications and web-mapping platforms. | + | * Used by: GeoJSON files can be opened by desktop mapping applications and web-mapping tools, by R (with the right extensions), and by text editors. The command-line utility ogr2ogr (part of GDAL) supports conversion from many geospatial file formats into GeoJSON. |
- | === PDF (Portable Document Format) === | + | === KML (Keyhole Markup Language) ★★★ === |
- | * What: A widespread document format developed by Adobe for sharing text and images in a fixed, un-layered layout. Text from PDFs created with optical-character recognition technology can be copied and pasted to a more open format. | + | * What is it: KML is an XML notation format used to express geographic information (longitude, latitude, altitude) on two- and three-dimensional maps. These files are easily readable by humans and machines. |
- | * How: Most modern web browsers (e.g. Firefox, Google Chrome, Opera, Safari), can open PDF files without additional plugins. There are several free applications for viewing (but not necessarily editing) PDFs, including Adobe Reader, Apple Preview, and OpenOffice. | + | * Used by: These files were originally developed for use with Google Earth. They can also be opened in a variety of other desktop GIS applications and web-mapping platforms. |
- | === PNG (Portable Network Graphics) === | + | === SHP (ESRI Shapefile) ★★★ === |
- | * What: PNG is a raster graphics file format that supports lossless data compression. The data catalog uses PNG format for general images. | + | * What is it: Shapefiles are one of the most common geospatial formats out there. Like GeoJSON, shapefiles can store both spatial geometries (points, lines and polygons) and other feature attributes. In our data catalog, you will find shapefiles zipped together with a few other files (with extension .shx, .dbf, .sbn). |
- | * How: PNG files can be opened in any image viewer and can be pasted into documents. | + | * Used by: In addition to ArcGIS, these files can be opened in free and open source GIS applications like QGIS. They can also be converted to many other data formats. |
- | === SHP (ESRI Shapefile) === | + | === PDF (Portable Document Format) ★ === |
- | * What: Shapefiles are one of the most common geospatial formats out there. Like GeoJSON, shapefiles can store both spatial geometries (points, lines and polygons) and other feature attributes. In our data catalog, you will find shapefiles zipped together with a few other files (with extension .shx, .dbf, .sbn). | + | * What is it: A widespread document format developed by Adobe for sharing text and images in a fixed, un-layered layout. Text from PDFs created with optical-character recognition technology can be copied and pasted to a more open format. |
- | * How: In addition to ArcGIS, these files can be opened in free and open source GIS applications like QGIS. They can also be converted to many other data formats. | + | * Used by: Most modern web browsers (e.g. Firefox, Google Chrome, Opera, Safari), can open PDF files without additional plugins. There are several free applications for viewing (but not necessarily editing) PDFs, including Adobe Reader, Apple Preview, and OpenOffice. |
- | === TIFF (Tagged Image File Format) === | + | === ZIP ★ === |
- | * What: TIFF is a computer file format used to store raster graphic images, which are made up of usually rectangular grids of pixels. In our data catalog, TIFFs are used for geographic images (e.g. satellite imagery or heat maps). These TIFFs include georeferencing information (or metadata) that allow you to project them onto a map and are referred to as GeoTIFFs. | + | * What is it: A Zip file is a compressed archive file that is used to make large files and collections of files more manageable to the user. When a .zip file is created, data is compressed to reduce the file size. Multiple files can be combined into a single Zip folder, making it easier to upload, download or email a volume of files. |
- | * How: TIFFs can be opened in image viewers, like Apple Preview. In the case of GeoTIFFs, you might find it useful to use GIS software, such as QGIS. | + | * Used by: Windows 10 and macOS both support ZIP files natively. Alternatives include 7-Zip (Windows); there are also many third-party apps for Mac. |
- | === TXT (Text) === | ||
- | * What: Essentially just textual data without stylistic formatting commands. All metadata is currently stored in TXT format, though this may change when a new data management platform is adopted. | ||
- | * How: Can be opened in any text editor. |
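
The CSV and GeoJSON descriptions above can be made concrete with a short sketch. It assumes a delimited file with one row per point and coordinate columns named `longitude` and `latitude`. Those column names, the function name `csv_points_to_geojson`, and the sample rows are all hypothetical and are not taken from any catalog dataset. It uses only the Python standard library and is meant as an illustration, not as the catalog's own tooling.

```python
import csv
import io
import json


def csv_points_to_geojson(csv_text, lon_field="longitude", lat_field="latitude", delimiter=","):
    """Turn delimited text with coordinate columns into a GeoJSON FeatureCollection dict."""
    features = []
    # The delimiter is a parameter because files labelled CSV may use tabs instead of commas.
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    for row in reader:
        # GeoJSON stores positions as [longitude, latitude], in that order.
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            # Every remaining column becomes a non-spatial attribute.
            "properties": dict(row),
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical tab-delimited sample rows, for illustration only.
SAMPLE_CSV = (
    "name\tlongitude\tlatitude\n"
    "Station A\t-6.2603\t53.3498\n"
    "Station B\t-6.2489\t53.3441\n"
)

collection = csv_points_to_geojson(SAMPLE_CSV, delimiter="\t")
geojson_text = json.dumps(collection, indent=2)
```

The resulting `geojson_text` could be saved with a `.geojson` extension and opened in a desktop mapping application or web-mapping tool. Note that all non-coordinate values stay as strings, since CSV carries no type information.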