Dark Data

Dark data is unstructured, unlabeled, and untapped data that sits in data repositories and has not been analyzed or processed. It is similar to big data but differs in that business and IT administrators mostly neglect it in terms of its value.

Dark data is also known as dusty data.

Dark data is data found in log files and data archives stored within large, enterprise-class data storage locations. It includes all data objects and types that have yet to be analyzed for business or competitive intelligence or to aid business decision making. Typically, dark data is complex to analyze and is stored in locations where analysis is difficult, so the process can be costly. It can also include data objects that the enterprise has not captured, or data that is external to the organization, such as data stored by partners or customers.

IDC, a research firm, has stated that up to 90% of big data is dark data.

Dark data is a subset of big data, but it constitutes the largest portion of the total volume of big data that organizations collect in a year. Companies usually leave dark data unanalyzed and unprocessed for various reasons, but that does not reduce its importance in terms of business value. There are two ways to view the importance of dark data. One view is that unanalyzed data contains undiscovered, important insights and represents a lost opportunity. The other view is that unanalyzed data, if not handled well, can result in many problems, such as legal and security issues.

What is dark data?

Organizations gather large volumes of data that, they believe, will help improve their products and services. For example, a company may collect data on how users use its products, internal statistics about software development processes, and website visits. However, a large portion of the collected data is never even analyzed. According to IDC, 90% of unstructured data is never analyzed. Such data is known as dark data. According to Gartner, dark data is “the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes.” Although the categories of dark data may vary across companies, the following categories of unstructured data are generally considered dark data:

  • Customer Information
  • Log Files
  • Previous Employee Information
  • Raw Survey Data
  • Financial Statements
  • Email Correspondences
  • Account Information
  • Notes or Presentations
  • Old Versions of Relevant Documents

Why is dark data handled the way it is?

The neglect is surprising because, at the time of data collection, companies assume that the data is going to provide value. Companies invest a lot in data collection, so both monetarily and otherwise, the data should be considered important. Here are a few reasons why there is so much dark data.

Lopsided priorities

Take the example of a bank analyzing online applications for credit cards. The credit card marketing team is focused solely on customer details and eligibility, and no attention is paid to the data on how the customer found the application page. The neglected data could have provided valuable insights into the usability of the bank's website and the application page, but no priority is assigned to this aspect.

Disconnect among departments

In large organizations, departments have their own data collection and storage processes, which may not be known to other departments. So data, even when relevant to other departments, lies unused. This is clearly a process issue.

Technology and tool constraints

If data collection is done with separate technologies and tools within the same organization, there may be cases where these technologies and tools do not interact with one another because of technological constraints. This prevents the organization from bringing all the data together and creating a cohesive picture. It happens especially in companies that have different IT systems and formats. For example, it may be difficult to integrate audio file contents from a call center with click data from websites. Companies that are in the early stages of a data analytics program face these problems.

Importance of dark data

As stated earlier, there are two ways to view the importance of dark data. Both views are examined below:

Perspective of opportunity not accessed

The area shown in black in the image below indicates dark data. The image illustrates the notional share of dark data that is present at any time.

Dark data represents a huge opportunity for companies to gain valuable insights that can drive their business. Take a look at the following examples:

  • Server log files can reveal website visitor behavior.
  • Customer call detail records can reveal customer sentiments and feelings.
  • Mobile geolocation data can reveal traffic patterns.
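
As a minimal sketch of the first example above, the following Python snippet counts successful page requests per path from web server access log lines. It assumes the logs follow the Common Log Format; the sample lines and the function name count_page_hits are hypothetical and serve only as an illustration.

```python
import re
from collections import Counter

# Matches the request part of a Common Log Format line, e.g.
# 203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /apply HTTP/1.1" 200 2326
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})')


def count_page_hits(lines):
    """Return a Counter of successful (2xx) GET requests per path."""
    hits = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like access log entries
        if match["method"] == "GET" and match["status"].startswith("2"):
            hits[match["path"]] += 1
    return hits


# Hypothetical sample lines, not taken from any real server.
sample_log = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /apply HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /apply HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:40 +0000] "GET /help HTTP/1.1" 200 512',
    '203.0.113.7 - - [10/Oct/2023:13:57:12 +0000] "GET /missing HTTP/1.1" 404 128',
]

# Print the most requested pages first.
for path, count in count_page_hits(sample_log).most_common():
    print(path, count)
```

Even a simple count like this shows which pages visitors actually reach, which is the kind of insight that stays hidden while the log files sit unread.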

Companies are letting go of opportunities by not tapping into dark data. It is also true that they need better processes, coordination, and technologies to use dark data appropriately.

Perspective of problems dark data can cause

Dark data can cause legal, financial, and other problems if it is not handled adequately. In fact, companies with piles of dark data are already staring at problems. Companies may face the following problems with dark data:

Legal and regulatory issues

If the stored data is covered by legal regulations, as credit card data is, exposure of such data could subject companies to financial and legal liabilities.

Intelligence risk

Companies may, through deliberate or inadvertent disclosures, lose proprietary or sensitive data on business operations, products, financial status, and business plans. This could adversely impact the business.

Loss of reputation

Companies are viewed as custodians of the data they collect. So any loss of data, especially sensitive and confidential data, can result in a loss of reputation.

Opportunity costs

If a company decides not to invest in the analysis and processing of dark data but its competitors do, its competitors are more likely to get ahead thanks to the insights they draw from dark data. That is the price the company pays for lost opportunities.

Better ways to handle dark data

Whichever way you view dark data, as an opportunity or as a source of problems, you cannot deny its importance. The ideal way to handle dark data is to use it well, but that may not be easy considering the investments required. Still, there has to be a start. Data left unused may become redundant over time. Also, it is unlikely that all of the dark data will be valuable. So you should neither throw away all of the dark data nor consider all of it a goldmine. Here are some ways to get the best out of dark data.

  • Regularly audit and prune the data. This means structuring or assigning categories to old data so that you know what kind of data is stored and where. You do not have to dump that data. With storage becoming inexpensive, there is no need to dump it. Later, you may suddenly need the data, and because you have organized it well, you will find it quickly.
  • Apply strong encryption standards to the data. This applies both to data sitting on in-house servers and to data in cloud storage. Encryption can prevent many data security problems.
  • Have data retention and safe disposal policies in place. The policies should be aligned with the prescriptions of the Department of Defense. Carefully formulate policies that identify data for erasure or destruction. Good retention policies will help you retain valuable data for later use.
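
The audit and retention steps above can be sketched roughly as follows. This is a minimal illustration, not a complete policy tool: the extension-to-category mapping, the 365-day retention period, and the function names categorize and audit are all assumptions made for the example. Files are only flagged for review and are never deleted.

```python
import os
import time
from collections import defaultdict

# Hypothetical mapping from file extension to a data category.
CATEGORIES = {
    ".log": "log files",
    ".csv": "raw survey data",
    ".eml": "email correspondence",
    ".pptx": "notes or presentations",
}

RETENTION_DAYS = 365  # assumed retention period; set this from your own policy


def categorize(filename):
    """Map a file name to a category, or 'uncategorized' if unknown."""
    extension = os.path.splitext(filename)[1].lower()
    return CATEGORIES.get(extension, "uncategorized")


def audit(root, now=None):
    """Walk `root` and return {category: [(path, age_in_days, past_retention)]}."""
    now = time.time() if now is None else now
    inventory = defaultdict(list)
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(folder, name)
            # Age is based on the last modification time of the file.
            age_days = (now - os.path.getmtime(path)) / 86400
            inventory[categorize(name)].append(
                (path, age_days, age_days > RETENTION_DAYS)
            )
    return dict(inventory)


if __name__ == "__main__":
    # Audit the current directory and summarize what was found.
    for category, entries in sorted(audit(".").items()):
        flagged = sum(1 for _path, _age, past in entries if past)
        print(f"{category}: {len(entries)} files, {flagged} past retention")
```

Flagging rather than deleting matches the advice above: the inventory tells you what is stored and where, and the decision to erase anything is left to the retention policy.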


Dark data certainly represents unused opportunities that many companies are letting go of because of process, investment, and technology constraints. In a sense, this failure to use dark data also makes big data collection, which is a huge exercise, a partial failure. Though the investments needed to tap the potential of dark data may be expensive, the effort is well worth it. And when companies prefer to just sit on dark data and do nothing, they are actually exposing themselves to several risks, as described earlier. The key is to do something about dark data and not treat it as a dead, useless thing.