Questions tagged [weka]

Weka (Waikato Environment for Knowledge Analysis) is a collection of machine learning algorithms for data mining tasks.

From: The University of Waikato

"...The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

Found only on the islands of New Zealand, the Weka is a flightless bird with an inquisitive nature. The name is pronounced like this, and the bird sounds like this.

Weka is open source software issued under the GNU General Public License."

201 questions
49
votes
1 answer

How to interpret error measures?

I am running the classify in Weka for a certain dataset and I've noticed that if I'm trying to predict a nominal value the output specifically shows the correctly and incorrectly predicted values. However, now I'm running it for a numerical…
FloIancu
15
votes
1 answer

Principal Component Analysis Vs Feature Selection

I am doing a machine learning project using WEKA. It is a supervised classification task, and in my basic experiments I achieved a very poor level of accuracy. My intention was then to do feature selection, but then I heard about PCA. In feature…
vigamage
13
votes
4 answers

Classifier for uncertain class labels

Let's say I have a set of instances with associated class labels. It does not matter how these instances were labelled, but how certain their class membership is. Each instance belongs to exactly one class. Let's say I can quantify the certainty of…
grssnbchr
7
votes
2 answers

Interactive decision trees

I was wondering if there is a free tool to build a decision tree interactively, as in SAS Enterprise Miner. I'm used to working with Weka, but nothing there fits my needs. I would like the program, before splitting every node, to ask the user…
Simone
6
votes
1 answer

How to interpret Weka Logistic Regression output?

Please help interpret results of logistic regression produced by weka.classifiers.functions.Logistic from the WEKA library. I use numeric data from WEKA examples: @relation weather @attribute outlook {sunny, overcast, rainy} @attribute temperature…
Anton Ashanin
6
votes
1 answer

How to decide which decision tree classifier to use?

I am confused about which decision tree algorithm in Weka to use for my application. I have 5 real-valued input variables and 2 classes. In various online tutorials, J48 (C4.5) seems to be the algorithm of choice. Are there any rules of thumb / tips /…
user13107
6
votes
2 answers

Is cross-validation an effective approach for feature/model selection for microarray data?

I've been working with WEKA to build class predictors using this (rather old..) breast cancer dataset. The dataset is divided into a training and a test set. I've been testing different learning schemes (mostly focused on feature selection) using…
Ben
6
votes
3 answers

Data mining classification competition

I'm currently taking a data mining class, and for one of our projects we're required to predict the class label for an unknown data set by first building a classifier on a training data set which already provides the class label. We're only required…
LearnHK
5
votes
1 answer

Interpretation of a one cluster solution using the EM cluster algorithm

I'm trying to use the EM clustering algorithm provided by Weka to cluster my data, and it only finds one cluster. Could I interpret this as meaning that there is no way to distinguish the instances in my sample? This is a result that is…
Rafael
5
votes
1 answer

What is the right attitude toward open source machine learning toolkits?

There are lots of machine learning toolkits nowadays, such as Weka, sklearn, and R libraries. These toolkits are convenient, but if we choose to use them we might sometimes lose control of what is really happening. For example, in some learning…
5
votes
3 answers

Cross Validation with Preprocessing (Normalization, Discretization, Feature Selection)

I am now trying to evaluate my model with cross-validation. My dataset contains some numeric and nominal attributes. I carry out the following data preprocessing tasks: A. Normalization: Min-Max Normalization (to [0,1]) B. Discretization:…
5
votes
1 answer

Optimizing for target metrics in Weka

I'm a PhD student in Information Retrieval with some limited experience in ML. We've been working on a binary classification task with Weka (I'm using Weka programmatically via Java), specifically with Random Forest. Our results are coming out a…
shiri
4
votes
1 answer

Unsupervised Random Forest using Weka

I am having some issues understanding how unsupervised Random Forest works according to Breiman. I only have unlabeled data, so the thought arose to use unsupervised Random Forest and use the resulting dissimilarity matrix as input for a cluster…
Lex88
4
votes
1 answer

Obtaining R pec survival patient risk percentage

Introduction I have a 300,000-row cancer dataset with around 60 variables (cancer stage, year of diagnosis, radiation therapy, histology, etc.) with a time variable ("number of months survived") and an event (alive or dead). The last two variables…
4
votes
2 answers

Sweeping across multiple classifiers and choosing the best?

I'm using Weka to perform classification, clustering, and some regression on a few large data sets. I'm currently trying out all the classifiers (decision tree, SVM, naive bayes, etc.). Is there an automated way (in Weka or other machine learning…