{"id":8027,"date":"2020-10-01T00:00:00","date_gmt":"2020-10-01T04:00:00","guid":{"rendered":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/"},"modified":"2024-10-14T09:23:54","modified_gmt":"2024-10-14T13:23:54","slug":"data-cataloging","status":"publish","type":"glossary","link":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/","title":{"rendered":"Data Cataloging"},"content":{"rendered":"\r\n<ul class=\"anchorlinks wp-block-list\">\r\n<li><a href=\"#what\">What is data cataloging?<\/a><\/li>\r\n<li><a href=\"#setup\">How to set up a data catalog?<\/a><\/li>\r\n<li><a href=\"#benefits\">What is data cataloging good for?<\/a><\/li>\r\n<li><a href=\"#types\">Types of data catalogs<\/a><\/li>\r\n<li><a href=\"#summary\">Summary<\/a><\/li>\r\n<\/ul>\r\n<h2 id=\"what\">What is data cataloging?<\/h2>\r\n<p>Data cataloging is the process of making an organized inventory of your data. Once you\u2019ve completed your data mapping process, the data catalog (think card catalog in a library) is what you\u2019ll use to index where everything is stored.<\/p>\r\n<p>A data catalog uses metadata (the data about your data) to collect, tag, and index datasets. The datasets themselves may be stored in a data warehouse, data lake, master repository, or another storage location. Many enterprise companies choose to use cloud storage for their data.<\/p>\r\n<p>The greatest advantage of a well-organized data catalog is the access to insights it gives you once your data is labeled correctly and easy to find. A data catalog allows you to see all of the available datasets, quickly identify what you\u2019re looking for, and evaluate and analyze it efficiently and with confidence.<\/p>\r\n<p>Properly done, data cataloging gives you visibility over all your data and a single source of truth across all your data stores. 
In short, if your organization needs to analyze and leverage a continually expanding storehouse of data, it needs a data catalog.<\/p>\r\n<h2 id=\"setup\">How to set up a data catalog?<\/h2>\r\n<p><strong>The first step<\/strong> to data cataloging is collecting your metadata, including tags, files, labels, and tables. That\u2019s what your data catalog will consist of (it won\u2019t be storing the actual data). You can set up catalog software to crawl your data sources and gather this information from places like data warehouses, cloud-based systems like AWS, data storage platforms like Hadoop, BI solutions, transactional databases that use SQL, and NoSQL databases like MongoDB.<\/p>\r\n<p><strong>Next,<\/strong> you\u2019ll build a data dictionary to serve as an index for easy identification and, ultimately, retrieval. Data dictionaries have become more popular with the surge in usage of BI platforms like Sisense.<\/p>\r\n<p>Data analysts and business users are also recognizing the value of data dictionaries. These less technical users appreciate the ability to assess the relevance of a certain dataset without diving in too deep. The data catalog then adds context to what\u2019s in the dictionary, with its improved capabilities for automation, discovery, and classification.<\/p>\r\n<p><strong>The next step is<\/strong> implementing a BI platform like Sisense, to give you more efficient ways to interact with your data. You can manage and add to your data catalog directly inside the BI platform.<\/p>\r\n<h2 id=\"benefits\">What is data cataloging good for?<\/h2>\r\n<p><strong>Proper data cataloging can help ease<\/strong> the data compliance and governance burden in your organization. You can set up tools and labeling that relate to PII, data privacy, and reporting. 
These can help you organize and retrieve information in a way that keeps you in line with <a href=\"https:\/\/www.hhs.gov\/hipaa\/index.html\" target=\"_blank\" rel=\"noopener\">HIPAA<\/a>, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Dodd%E2%80%93Frank_Wall_Street_Reform_and_Consumer_Protection_Act\" target=\"_blank\" rel=\"noopener\">Dodd-Frank<\/a>, <a href=\"https:\/\/gdpr-info.eu\/\" target=\"_blank\" rel=\"noopener\">GDPR<\/a>, and other key regulations.<\/p>\r\n<p><strong>From an accuracy point of view<\/strong>, data cataloging can help you identify the most relevant and up-to-date information by standardizing the way you store and label data. You can create clear, consistent definitions and attributes to build a comprehensive information system that even non-technical users can benefit from.<\/p>\r\n<p><strong>One more benefit of data cataloging:<\/strong> it will help you improve and maintain data quality by ensuring consistent use of data elements and encouraging transparency. The users of your data catalog must be confident that they are not creating models and reports with bad data.<\/p>\r\n<h2 id=\"types\">Types of data catalogs<\/h2>\r\n<p>When it comes to organizing big data, there\u2019s no such thing as a one-size-fits-all approach. Gartner identifies three distinct subcategories of data catalogs, so you can determine which type is right for your company\u2019s situation:<\/p>\r\n<ul>\r\n<li><strong>Tool-specific or vendor data catalogs<\/strong><\/li>\r\n<\/ul>\r\n<p>These data catalogs may be delivered as part of a cloud-based data lake, data preparation tool, or Hadoop distribution. This method requires little input on the part of the organization, but has its limits, since you may end up with multiple data catalogs as your list of vendors grows. 
That makes it more laborious to plug in a BI solution and set up your single source of truth.<\/p>\r\n<ul>\r\n<li><strong>Data catalogs specifically meant for data lakes<\/strong><\/li>\r\n<\/ul>\r\n<p>This type of data catalog is used primarily by data scientists and data engineers. This approach, while thorough, has limited adaptability across the organization and doesn\u2019t easily allow business users to access the data and leverage it for their own digital initiatives.<\/p>\r\n<ul>\r\n<li><strong>Enterprise data catalogs for analysis and teamwork<\/strong><\/li>\r\n<\/ul>\r\n<p>Gartner defines these as \u201cgeneralist, business-oriented data catalogs for broader use in information governance and infonomics \u2013 targeted at the Chief Data Officer (CDO).\u201d<\/p>\r\n<h2 id=\"summary\">In summary<\/h2>\r\n<p>Cleaner, faster, and more transparent analysis is at your fingertips with a well-organized data catalog. Your data catalog should empower your employees to get better data insights and make smart decisions quickly. This will set your organization on its way to becoming truly data-driven.<\/p>\r\n<p><a class=\"action-btn \" href=\"\/resources\/glossary\/\">Back to Glossary<\/a><\/p>","protected":false},"excerpt":{"rendered":"<p>Data cataloging is the process of making an organized inventory of your data. 
The data catalog is what you\u2019ll use to index where everything is stored.<\/p>\n","protected":false},"featured_media":8154,"template":"","meta":{"_acf_changed":false,"_searchwp_excluded":"","_links_to":"","_links_to_target":""},"application":[],"department":[],"glossary-category":[],"industry":[],"role":[],"topic":[],"class_list":["post-8027","glossary","type-glossary","status-publish","has-post-thumbnail","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.5 (Yoast SEO v23.8) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Data Cataloging: Meaning, Benefits, and Tools - Sisense<\/title>\n<meta name=\"description\" content=\"Data cataloging is the process of making an organized inventory of your data. The data catalog is what you\u2019ll use to index where everything is stored.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Data Cataloging\" \/>\n<meta property=\"og:description\" content=\"Data cataloging is the process of making an organized inventory of your data. 
The data catalog is what you\u2019ll use to index where everything is stored.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/\" \/>\n<meta property=\"og:site_name\" content=\"Sisense\" \/>\n<meta property=\"article:modified_time\" content=\"2024-10-14T13:23:54+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/cdn.sisense.com\/wp-content\/uploads\/header-logo.png\" \/>\n\t<meta property=\"og:image:width\" content=\"154\" \/>\n\t<meta property=\"og:image:height\" content=\"32\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/cdn.sisense.com\/wp-content\/uploads\/header-logo.png\" \/>\n<meta name=\"twitter:site\" content=\"@sisense\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/\",\"url\":\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/\",\"name\":\"Data Cataloging: Meaning, Benefits, and Tools - Sisense\",\"isPartOf\":{\"@id\":\"https:\/\/www.sisense.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/cdn.sisense.com\/wp-content\/uploads\/Open-source-data-mapping.png\",\"datePublished\":\"2020-10-01T04:00:00+00:00\",\"dateModified\":\"2024-10-14T13:23:54+00:00\",\"description\":\"Data cataloging is the process of making an organized inventory of your data. 
The data catalog is what you\u2019ll use to index where everything is stored.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#primaryimage\",\"url\":\"https:\/\/cdn.sisense.com\/wp-content\/uploads\/Open-source-data-mapping.png\",\"contentUrl\":\"https:\/\/cdn.sisense.com\/wp-content\/uploads\/Open-source-data-mapping.png\",\"width\":570,\"height\":398,\"caption\":\"Open source data mapping\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.sisense.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Data Cataloging\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.sisense.com\/#website\",\"url\":\"https:\/\/www.sisense.com\/\",\"name\":\"Sisense\",\"description\":\"Build your business with 
anywhere-analytics\",\"publisher\":{\"@id\":\"https:\/\/www.sisense.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.sisense.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.sisense.com\/#organization\",\"name\":\"Sisense\",\"url\":\"https:\/\/www.sisense.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.sisense.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/cdn.sisense.com\/wp-content\/uploads\/sisense-yoast-og.jpg\",\"contentUrl\":\"https:\/\/cdn.sisense.com\/wp-content\/uploads\/sisense-yoast-og.jpg\",\"width\":1200,\"height\":600,\"caption\":\"Sisense\"},\"image\":{\"@id\":\"https:\/\/www.sisense.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/sisense\",\"https:\/\/www.linkedin.com\/company\/sisense\",\"https:\/\/github.com\/sisense\/\"],\"description\":\"Sisense accelerates product innovation through AI\/ML capabilities. Our global analytics platform lets customers drive better, faster decisions for their business and end users.\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Data Cataloging: Meaning, Benefits, and Tools - Sisense","description":"Data cataloging is the process of making an organized inventory of your data. The data catalog is what you\u2019ll use to index where everything is stored.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/","og_locale":"en_US","og_type":"article","og_title":"Data Cataloging","og_description":"Data cataloging is the process of making an organized inventory of your data. 
The data catalog is what you\u2019ll use to index where everything is stored.","og_url":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/","og_site_name":"Sisense","article_modified_time":"2024-10-14T13:23:54+00:00","og_image":[{"width":154,"height":32,"url":"https:\/\/cdn.sisense.com\/wp-content\/uploads\/header-logo.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_image":"https:\/\/cdn.sisense.com\/wp-content\/uploads\/header-logo.png","twitter_site":"@sisense","twitter_misc":{"Est. reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/","url":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/","name":"Data Cataloging: Meaning, Benefits, and Tools - Sisense","isPartOf":{"@id":"https:\/\/www.sisense.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#primaryimage"},"image":{"@id":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#primaryimage"},"thumbnailUrl":"https:\/\/cdn.sisense.com\/wp-content\/uploads\/Open-source-data-mapping.png","datePublished":"2020-10-01T04:00:00+00:00","dateModified":"2024-10-14T13:23:54+00:00","description":"Data cataloging is the process of making an organized inventory of your data. 
The data catalog is what you\u2019ll use to index where everything is stored.","breadcrumb":{"@id":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.sisense.com\/glossary\/data-cataloging\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#primaryimage","url":"https:\/\/cdn.sisense.com\/wp-content\/uploads\/Open-source-data-mapping.png","contentUrl":"https:\/\/cdn.sisense.com\/wp-content\/uploads\/Open-source-data-mapping.png","width":570,"height":398,"caption":"Open source data mapping"},{"@type":"BreadcrumbList","@id":"https:\/\/www.sisense.com\/glossary\/data-cataloging\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.sisense.com\/"},{"@type":"ListItem","position":2,"name":"Data Cataloging"}]},{"@type":"WebSite","@id":"https:\/\/www.sisense.com\/#website","url":"https:\/\/www.sisense.com\/","name":"Sisense","description":"Build your business with 
anywhere-analytics","publisher":{"@id":"https:\/\/www.sisense.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.sisense.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.sisense.com\/#organization","name":"Sisense","url":"https:\/\/www.sisense.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.sisense.com\/#\/schema\/logo\/image\/","url":"https:\/\/cdn.sisense.com\/wp-content\/uploads\/sisense-yoast-og.jpg","contentUrl":"https:\/\/cdn.sisense.com\/wp-content\/uploads\/sisense-yoast-og.jpg","width":1200,"height":600,"caption":"Sisense"},"image":{"@id":"https:\/\/www.sisense.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/sisense","https:\/\/www.linkedin.com\/company\/sisense","https:\/\/github.com\/sisense\/"],"description":"Sisense accelerates product innovation through AI\/ML capabilities. 
Our global analytics platform lets customers drive better, faster decisions for their business and end users."}]}},"_links":{"self":[{"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/glossary\/8027"}],"collection":[{"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/glossary"}],"about":[{"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/types\/glossary"}],"version-history":[{"count":0,"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/glossary\/8027\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/media\/8154"}],"wp:attachment":[{"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/media?parent=8027"}],"wp:term":[{"taxonomy":"application","embeddable":true,"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/application?post=8027"},{"taxonomy":"department","embeddable":true,"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/department?post=8027"},{"taxonomy":"glossary-category","embeddable":true,"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/glossary-category?post=8027"},{"taxonomy":"industry","embeddable":true,"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/industry?post=8027"},{"taxonomy":"role","embeddable":true,"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/role?post=8027"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/www.sisense.com\/wp-json\/wp\/v2\/topic?post=8027"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}