Questions tagged [knowledge-discovery]

17 questions
9
votes
3 answers

Guidelines for discovering new knowledge in data

I plot something to make a point to myself or someone else. Usually, a question starts this process, and often the person asking hopes for a particular answer. How can I learn interesting things about the data in a less biased way? Right now I'm…
6
votes
1 answer

Knowledge graph: how to get into it?

I am looking for internships in the AI for drug discovery field and I came upon a new topic, described on a company's website as automatic reasoning in a knowledge graph. I tried to look for textbooks or even tutorials in the field of…
5
votes
1 answer

How to make use of known constants when modeling from data?

As a (perhaps contrived) example, let's say we want to discover from some empirical data Coulomb's law for an electric field: $$F = \frac{1}{4 \pi \epsilon_0} \cdot \frac{|q|}{r^2}$$ In this case we would have a matrix whose columns include force…
rhombidodecahedron
4
votes
2 answers

Data mining, domain knowledge and visualization

In my book (Introduction to data mining by Tan, Steinbach and Kumar - chapter 3), in the section about visualization, it's written: Another general motivation for visualization is to make use of the domain knowledge that is "locked up in people's…
Gigili
2
votes
1 answer

Book recommendation required: a book that comprehensively teaches both the fundamentals of statistics and the related mathematical concepts

I am trying to build a foundation for data science through self study. Is there a book that could teach me the fundamentals of statistics along with the related mathematical concepts in a comprehensive manner? Preferably one that has sufficiently…
2
votes
1 answer

Book recommendations needed - building foundational knowledge for ISL - Introduction to Statistical Learning (by Gareth James)

I'm trying to build a data science base from scratch. I started a book called Introduction to Statistical Learning by Gareth James and found that there are many mathematical & statistical concepts that I'm unfamiliar with. I want to bridge this gap…
2
votes
0 answers

Cross-validation for parameter tuning in data mining process (KDD)

In my project I want to compare different classification algorithms to solve a specific problem with a specific dataset. To do this, I divided the dataset into 2 parts. With the first (bigger) part I am doing cross-validation. In this step I try to…
1
vote
0 answers

Researcher who reinvented least squares regression? Urban legend?

I can't recall where I read about this. Supposedly, a young researcher (or student?) in the 1970s or 1980s independently rediscovered and published their methodology for ordinary least squares regression, completely ignorant of the fact that OLS…
RobertF
1
vote
1 answer

Verification goal of KDD? Cases where it applies

I'm having difficulties in finding examples of a Verification Data-mining/KDD approach. Let me explain what I mean by that. In the notorious article From Data Mining to KDD by Fayyad et al the following definition is mentioned: "The knowledge…
Homunculus
1
vote
1 answer

Dempster-Shafer Evidence Theory

I am trying to combine the predicted class labels of 3 different classifiers (for example SVM, Naive Bayes, etc.) to offset the weaknesses of each individual classifier. I am trying to use Dempster-Shafer evidence theory in order to…
Ambarish
1
vote
3 answers

What kind of results are there about prior knowledge?

Consider the following two problems: We want to do linear regression, where we know that the true model is actually linear. In the general case, linear regression includes an offset. However, we know, using prior knowledge, that the true model…
modeler
0
votes
0 answers

Is there a concept for "almost parallel" edge?

By "almost parallel" I mean if set S of closing nodes has many conflicting edges with set T of closing nodes, then the edges between them are almost parallel. The emphasized edges in this image illustrate that: Is there any algorithm to detect…
Ooker
0
votes
1 answer

What metric should I use for evaluating a reading order of text tokens given the correct ordering?

Given an ordering of tokens extracted from a document, with a ground-truth ordering available, what would be the correct way to evaluate the ordering? I took a look at some Machine Translation evaluation metrics such as Word Error Rate and the BLEU…
0
votes
1 answer

Estimate number of objects in an environment given agent's observations

I'm solving a reinforcement learning-like problem, where I have an agent trying to survive in a 2D room. This room contains a finite and constant number of moving objects that interact with the agent. The number of objects is not known to the agent,…
0
votes
1 answer

Procedures to create classes for classification?

The title isn't clear; that's because the problem itself is strange to me. I was asked to create categories that will later be used in statistical learning. That means that the classification algorithms that will be used will classify feature…
nidabdella