Questions tagged [rapidminer]

RapidMiner is a software platform, developed by the company of the same name, that provides an integrated environment for machine learning, data mining, text mining, predictive analytics and business analytics. It is used for business and industrial applications as well as for research, education, training, rapid prototyping, and application development. It supports all steps of the data mining process, including results visualization, validation and optimization. RapidMiner is developed on a business source model, which means the core and earlier versions of the software are available under an OSI-certified open source license. [Wikipedia]

24 questions

1 answer

How can I find out if a subset of Stack Exchange users increase/decrease their post rate based on badges earned?

I'm trying to mine the Stack Exchange data dump to find out whether there is a cluster of users that may be positively or negatively affected by the number of badges they've been awarded. The theory I'm working on is that for some people who were…

clustering data-mining rapidminer

asked Apr 13 '12 at 19:36

cflewis

1 answer

Sliding window validation for time series

I have a broad question about sliding window validation. Specifically, I am looking at using RapidMiner to predict future values of a financial series using "lagged" values of that series and other covariates. I have been experimenting with the…

time-series data-mining rapidminer

asked Jun 13 '11 at 13:09

B_Miner

1 answer

How to choose a data subset in RapidMiner?

I'm working with a CSV which contains approximately 220,000 entries. My aim is to predict one of the attributes (ATT1) using the other 3 (ATT2, ATT3, ATT4). I've been able to do this using NaiveBayes, but now I feel unsatisfied with the result. The…

dataset rapidminer

asked Apr 09 '11 at 12:49

Gurzo

1 answer

Prediction of soccer matches / process setup and optimization

I'm currently working on my master's thesis about the application of data mining in football. I'm trying to predict matches based on some stats of the two teams involved (using RapidMiner). My use case is the German Bundesliga and I will predict the…

predictive-models prediction model-evaluation rapidminer

asked Aug 18 '16 at 19:36

Andreas Brauchle

1 answer

Visualizing large datasets stored in files or in memory in Redis (millions of data points)

I am very active on Stack Exchange's Quantitative Finance site but thought this question would be more suitable here. I generate large time series data and store it in memory in Redis (alternatively, I could save it to disk in any format) and…

r data-visualization large-data rapidminer c#

asked Jul 06 '13 at 09:38

Matt

2 answers

How can I detect when a key was pressed with accelerometer or gyroscope data?

I have a dataset (~20k samples) of sensor data gathered from a smartphone. What I want to do with it is to detect those spikes you can see in the graphs below. They occur when the user presses a button. I want to label the data that refers to those…

r predictive-models filter anomaly-detection rapidminer

asked Dec 29 '16 at 16:03

keinabel

0 answers

Generating (forcing) confidence percentages in RapidMiner

I have a dependent variable (my 'label' in RapidMiner terms) that is a binary classification expressed as 'WIN' or 'NOTWIN'. I know 'NOTWIN' is exhibited in about 90% of all observations. When I try to run a K Nearest Neighbors approach, the…

rapidminer

asked Nov 26 '12 at 04:33

Brett

1 answer

Information Gain vs Gain Ratio

When building a decision tree, when is it better to prefer the information gain criterion over the gain ratio criterion, and why?

data-mining cart mutual-information rapidminer

asked May 16 '18 at 08:53

Qwerto

1 answer

Low recall and high precision in text summarization

We are trying to build a model to summarize Persian news. About 14,000 news articles were summarized by humans (supervised), and then we extracted all sentences (about 180,000) and labeled them (true if selected for the summary, false if not).…

classification text-mining maximum-entropy rapidminer

asked Apr 04 '16 at 18:49

Oli

1 answer

Tool for hierarchical clustering

I'm trying to perform a hierarchical clustering analysis on a dataset of 40 attributes and more than 70,000 records, mostly composed of categorical variables. I've used Matlab and RapidMiner to execute the analysis but among their poor performance…

feature-selection matlab hierarchical-clustering rapidminer

asked Oct 07 '14 at 06:45

formacero10

1 answer

How to re-cluster a new instance in centroid-based clustering?

I have applied clustering algorithms like k-means, k-medoids and DBSCAN on my patient dataset. For each algorithm RapidMiner generated a cluster model (centroid table, graphs, etc.) and a clustered set (shows which examples are part of which…

clustering k-means rapidminer

asked Jul 08 '13 at 19:51

user1015347

1 answer

Loop over Tokens in RapidMiner's Text Processing Plugin

Is there any way to iterate over the tokens of a text document within RapidMiner? My first attempt was to window the document after tokenisation, but this seems very complicated. I'm doing this to simulate the creation of a language model like…

text-mining natural-language rapidminer

asked Apr 05 '13 at 18:09

Andreas

0 answers

Problem with unequal distribution of classes in sentiment classification

I am performing a binary sentiment classification (positive/negative) with RapidMiner. My problem is that I have about 400 positive and 1350 negative documents. I get pretty good accuracy, but my precision and recall for the positive class…

classification rapidminer

asked Dec 29 '12 at 00:37

user18075

2 answers

How does RapidMiner handle equal distances in the KNN algorithm?

I already asked in the RapidMiner forum, but no one has answered yet: https://community.rapidminer.com/discussion/55963/how-k-nn-algorithms-work-with-same-distance-in-rapidminer#latest I can't find a satisfying answer for the KNN algorithm…

k-nearest-neighbour euclidean rapidminer

asked Aug 13 '19 at 01:17

AdeMuchlis

2 answers

Different prediction score for two SVM-based classifiers

As a validation study, I run two libsvm-based SVM classifiers against the same data set. One classifier is the libsvm implementation in RapidMiner. The other is libsvm itself. Both use the same parameter settings. However, the…

machine-learning svm libsvm rapidminer

asked Sep 27 '12 at 18:17

user785099
