Questions tagged [dataset]

1 vote | 1 answer | 2.2k views

Show parent DataTable in one DataGridView and child DataTable elements in another?

Hi I have created a dataset and two datatables. The URLData table is parented to the KeywordData table via the keyword column. 'Create the primary object to hold both data tables Dim AllKeyWordInfo As DataSet = New DataSet('AllKeywordInfo') 'Create a Datatable For keyword Data. Data Tables are store...
Zach Johnson
0 votes | 0 answers | 5 views

How to create a real-time dataset on a certain topic

I'm working on a data-mining project on social media, but I found that Facebook limits the information developers can get. Is there any possible way to create a dataset from the Facebook Graph API? If that is not possible, how do I get started with web crawling on social medi...
Novice
0 votes | 1 answer | 17 views

Why is spark.implicits._ embedded just before converting any RDD to a DS and not as a regular import?

I am learning Spark Datasets and checking how we can convert an RDD to a Dataset. For this, I got the following code: val spark = SparkSession .builder .appName('SparkSQL') .master('local[*]') .getOrCreate() val lines = spark.sparkContext.textFile('../myfile.csv') val structuredData = lines.map(mappe...
KayV
0 votes | 0 answers | 3 views

How do companies like jwplayer, Vimeo, and Google create a dataset for certifying their encoding presets?

A bunch of sites give encoding recommendations for the videos being uploaded. Examples: youtube, jwplayer, vimeo. How do they create their datasets? The dataset has to be representative of all the kinds of content that can be served at whatever resolution. Is it an algorithmic or a manual process to pick vid...
gegupta
1 vote | 3 answers | 327 views

Counting number of occurrences on the whole Pandas DataFrame

My dataset has this structure A = [A1, A2, A3, A4] B = [B1, B2, B3] C = [C1, C2, C3, C4, C5] I want to count the occurrences of all variables in my dataset, such as: A1 3 A2 2 A3 1 ... C4 4 C5 5 I have tried df.groupby(df.columns[0]).A.count() but it only works column by column, is th...
Thanh Nguyen
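
The kind of whole-table count asked about above can be sketched without pandas, which may help to pin down the expected result before looking for a DataFrame idiom. This is a minimal sketch, not an answer taken from the thread: the `columns` dictionary and its values are made up, and `count_all_values` is a hypothetical helper name. With pandas itself, stacking the frame into one Series and calling `value_counts()` on it is one way that may give the same totals.

```python
from collections import Counter

# Hypothetical stand-in for the question's columns; the values are made up.
columns = {
    "A": ["A1", "A2", "A1", "A3"],
    "B": ["B1", "B1", "B2"],
    "C": ["C1", "C5", "C5", "C4", "C5"],
}


def count_all_values(cols):
    """Count every value across all columns at once, not column by column."""
    counts = Counter()
    for values in cols.values():
        counts.update(values)
    return counts


counts = count_all_values(columns)
for value, n in sorted(counts.items()):
    print(value, n)
```

Keeping a single `Counter` for the whole table is what removes the need for a separate groupby per column.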
1 vote | 2 answers | 1.4k views

How to drop malformed rows while reading csv with schema Spark?

When I use a Spark DataSet to load a csv file, I prefer to specify the schema explicitly. But I find that a few rows are not compliant with my schema. A column should be double, but some rows have non-numeric values. Is it possible to filter out all rows that are not compliant with my schema from the DataSet...
HouZhe
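
The idea behind the question above, keeping only rows that convert cleanly under a declared schema, can be illustrated in plain Python. This is a sketch of the concept only, not Spark code: `SCHEMA`, `read_conforming_rows`, and the sample rows are all hypothetical. In Spark, the CSV reader's `mode` option (with the `DROPMALFORMED` value) is probably the first thing worth checking in the documentation.

```python
import csv
import io

# Hypothetical schema: (column name, converter) pairs, in file order.
SCHEMA = [("name", str), ("score", float)]


def read_conforming_rows(text_stream, schema):
    """Return only the rows whose fields all convert under the schema."""
    rows = []
    for record in csv.reader(text_stream):
        if len(record) != len(schema):
            continue  # wrong number of fields: treat as malformed
        try:
            row = {name: conv(value) for (name, conv), value in zip(schema, record)}
        except ValueError:
            continue  # e.g. a non-numeric value in a float column
        rows.append(row)
    return rows


# Made-up sample: the second row has a non-numeric score and is dropped.
sample = io.StringIO("alice,1.5\nbob,not-a-number\ncarol,2\n")
clean = read_conforming_rows(sample, SCHEMA)
print(clean)
```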
1 vote | 1 answer | 374 views

How can I assign different colors to different bars in MPAndroidChart?

I want to assign colors to different bars. My code assigns a color to each barDataSet, but I want to assign one to each bar entry. How can I do that? Thank you! private ArrayList getDataSet() { ArrayList dataSets = null; //1. Cubuk ArrayList valueSet1 = new ArrayList(); BarEntry v1e1 = new BarEntry(10.000...
Emre Önder
1 vote | 1 answer | 65 views

Table count is wrong in dataset

I am supplying 3 servers to loop over; however, $mdtable.table.count is only 1. I must be missing something simple here. Can anyone please help me resolve this? Get-Content 'C:\test\computers.txt' | ? { $_.Trim() -ne '' } | ForEach-Object { $value = Invoke-Command -Computer $_ -ScriptBlock { Param($compu...
Saikiran Paramkusham
1 vote | 1 answer | 278 views

Binarize the ratings - MovieLens dataset

I am working on a personalised news recommendation engine based on the click behaviour of users. My features will be predefined news categories (such as politics, sport, etc.). Whenever a user clicks on an article, I build/update the user profile based on this article, then recommend another article from ar...
Ramin
1 vote | 1 answer | 1.3k views

How to train/test on my own dataset in Caffe?

I started with Caffe and the mnist example ran well. I have the training data and labels in data.mat. (I have 300 training samples with 30 features, and the labels are (-1, +1); all of this is saved in data.mat.) However, I don't quite understand how I can use Caffe with my own dataset. Is there a step by ste...
1 vote | 3 answers | 807 views

How To Read MVS System Catalog To Retrieve GDG Information?

I have a job (JCL) on the mainframe where I want to programmatically retrieve a particular GDG file's recent relative generation numbers from the system catalog (API call)...where I can then programmatically dig thru the results returned by the call to figure out the relative generation numbers. Th...
user278458
1 vote | 1 answer | 584 views

Sentiment Analysis - What does annotating dataset mean?

I'm currently working on my final year research project, an application that analyzes travel reviews found online and gives out a sentiment score for particular tourist attractions by conducting aspect-level sentiment analysis. I have a newly scraped dataset from a famous trav...
Mahesh De Silva
1 vote | 1 answer | 46 views

Difference between DBNull.Value and IsValueNull()

I'm getting some data from a stored procedure into a DataSet and then copying that data to a List. There are some NULL values in the data, and I'm checking for them with DBNull.Value. But whenever it comes to a NULL value, it gives me the error: Specified cast is not valid. This is how I'm copying data fro...
Null Pointer
1 vote | 2 answers | 1.1k views

HTML5 nested data-* attributes parsed with Javascript don't return a nested object

I am stuck on a concept of HTML5 data attributes. These attributes allow nesting like: I have seen plugins in the past (like select2) and some of them use the following similar syntax to make an AJAX call: This code in the background converts to a dataset in JavaScript and it returns something like this...
1 vote | 2 answers | 567 views

How to obtain the number of records of a dataset in SAS

I want to count the number of records in a dataset in SAS. Is there a function that does this in a simple way? I used R, and to obtain this information there was the length() function. Moreover, I need the number of records to compute some percentages, so I need this value not in a table but in a val...
Giacomo Rosaspina
1 vote | 2 answers | 46 views

Split columns and write to separate output file

I have a dataset with 8 columns and about 5 million rows. The size of the file is more than 400 MB. I am trying to separate the columns. The file extension is .dat and the columns are separated by a single space. Input: 00022d3f5b17 00022d9064bc 1073260801 1073260803 819251 440006 819251 440006 00022d9064bc 00022db...
Sitz Blogz
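
For a file of this size, one approach that may work is to stream it line by line and append each field to a per-column output file, so the whole dataset is never held in memory. The sketch below assumes that "separate the columns" means one output file per column; the excerpt is truncated, so the asker may want a different grouping. `split_columns` and the `col_N.txt` naming are hypothetical, and the second sample row is made up.

```python
import os
import tempfile


def split_columns(src_path, out_dir, sep=" "):
    """Stream src_path and write column i of every row to out_dir/col_i.txt."""
    outputs = []
    try:
        with open(src_path, "r", encoding="utf-8") as src:
            for line in src:
                fields = line.rstrip("\n").split(sep)
                # Open one output file per column the first time it appears.
                while len(outputs) < len(fields):
                    path = os.path.join(out_dir, f"col_{len(outputs) + 1}.txt")
                    outputs.append(open(path, "w", encoding="utf-8"))
                for handle, field in zip(outputs, fields):
                    handle.write(field + "\n")
    finally:
        for handle in outputs:
            handle.close()
    return len(outputs)


# Tiny demonstration: first row from the question, second row invented.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.dat")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("00022d3f5b17 00022d9064bc 1073260801 1073260803 819251 440006 819251 440006\n")
        f.write("00022d9064bc 00022d3f5b17 1073260802 1073260804 819251 440006 819251 440006\n")
    n_columns = split_columns(sample, tmp)
    with open(os.path.join(tmp, "col_3.txt"), encoding="utf-8") as f:
        third_column = f.read().split()

print(n_columns, third_column)
```

Memory use stays roughly constant because only one input line and the open file handles are live at any time.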
1 vote | 2 answers | 2.2k views

What is the best practice to collect a large data set from spark rdd?

I am using pyspark to process my data, and at the very end I need to collect data from the RDD using rdd.collect(). However, Spark crashes because of a memory problem. I tried a number of ways, but no luck. I am now running the following code, which processes a small chunk of data for each partition: def make...
JamesLi
1 vote | 1 answer | 297 views

Gnuplot: Plotting several datasets with titles from the pipe

As a follow up of: Gnuplot: Plotting several datasets with titles from one file, I have a test.dat file: 'p = 0.1' 1 1 3 3 4 1 'p = 0.2' 1 3 2 2 5 2 and I can plot it with no issues from within gnuplot using: > plot for [IDX=0:1] 'test.dat' i IDX u 1:2 w lines title columnheader(1) however I cannot...
DarioP
1 vote | 1 answer | 2.6k views

Getting a dataset of Photos and hashtags of Instagram

I am trying to come up with a dataset of public photos and some random hashtags regarding them from Instagram. Is there any API for that? Also is there a dataset for a list of hashtags or object vocabulary? Best
Aida Amini
1 vote | 2 answers | 173 views

Foreach loop only updating first row

I am trying to create an insert statement for certain records in the database. So the DataTable could return 100 rows, but in the foreach loop it is running the insert 100 times into the first row. I want the insert to run for each row in the DataTable. string status = @'select j.*, ' + ' (SELECT Sta...
user123456789
1 vote | 1 answer | 66 views

MATLAB: how to find intervals of data

I have a dataset of user trajectories: every location in a trajectory has these fields: [userId year month day hour minute second latitude longitude regionId]. Based on the field day, I want to divide the trajectories on a daily scale into intervals of different lengths: 3 hours, 4 hours...
elis56
1 vote | 4 answers | 97 views

How to change a sub-element of a data attribute using jQuery

I want to change a sub-element of the data attribute below. For this I have added the jQuery code below, but it doesn't work: $(document).ready(function(){ jQuery('.blue-shape').attr('data-actions',{event:'mouseenter', action:'jumptoslide', slide:'rs-16',delay:''}); }); .blue-shape is the div class name where I want...
Dipesh
1 vote | 1 answer | 1.1k views

Get values (point, vector, array, etc.) from `xr.Dataset` in Xarray ? (Python 3)

I can't figure out how to actually pull the data out of a xr.Dataset object. I can't figure out how to access individual values. How can I pull the values (point values, vectors, arrays, etc.) out of the Datasets like I can with the DataArrays? np.random.seed(0) DA_data = xr.DataArray(np.random...
O.rka
1 vote | 1 answer | 2k views

AttributeError: 'module' object has no attribute '__version__'

I have installed the lda library (using pip). I have a very simple test code (the next two lines): import lda print lda.datasets.load_reuters() But I keep getting the error AttributeError: 'module' object has no attribute 'datasets'. In fact, I get that each time I access any attribute/function under lda!
Samer Aamar
1 vote | 2 answers | 47 views

Comparing data without disclosing it

Two companies A and B want to compare their respective customer bases and figure out the overlap. Obviously, they can't exchange their customer bases. So they need to come up with a process to compare their lists without disclosing any information besides the intersection of both (which defies the w...
ibtarek
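
One simple baseline for the comparison described above is keyed hashing: both parties agree on a secret key, each hashes its own identifiers with HMAC-SHA256, and only the digests are exchanged and intersected. This is a sketch of that baseline, not a full private set intersection protocol: anyone holding the key can test guesses against the other side's digests, so it only suits parties that accept that risk. The key, the sample addresses, and the `blind` helper are all hypothetical.

```python
import hashlib
import hmac


def blind(identifiers, shared_key):
    """Map the keyed hash (HMAC-SHA256) of each normalized identifier to it."""
    blinded = {}
    for ident in identifiers:
        normalized = ident.strip().lower()
        digest = hmac.new(shared_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
        blinded[digest] = normalized
    return blinded


# Hypothetical key, agreed on through some separate channel.
shared_key = b"agreed-out-of-band"

# Made-up customer lists.
company_a = ["alice@example.com", "Bob@example.com ", "carol@example.com"]
company_b = ["bob@example.com", "dave@example.com", "carol@example.com"]

blinded_a = blind(company_a, shared_key)
blinded_b = blind(company_b, shared_key)

# Only the digests (the dictionary keys) would be exchanged between the parties.
common_digests = set(blinded_a) & set(blinded_b)
overlap_seen_by_a = sorted(blinded_a[d] for d in common_digests)
print(overlap_seen_by_a)
```

Normalizing before hashing matters here, because two spellings of the same customer would otherwise produce unrelated digests.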
1 vote | 2 answers | 388 views

Is map function on Datasets optimized for operations on one column?

For DataFrame, it is easy to generate a new column with some operation using a udf with df.withColumn('newCol', myUDF('someCol')). To do something like this in Dataset, I guess I would be using the map function: def map[U](func: (T) ⇒ U)(implicit arg0: Encoder[U]): Dataset[U] You have to pass the...
Janie
1 vote | 2 answers | 51 views

Contingency tables from data.frame columns

I'm trying to create a 4-way contingency table from my data set. My data set looks like this: a
Adela
1 vote | 3 answers | 3k views

How to check a value in dataset is empty or not?

I have a file upload option in my project. It includes a query which returns a dataset. It works fine. But now I want to check whether the returned dataset is empty or contains the same value I passed as a parameter to the query. Here is my back-end code. .cs code if ((FileUpload1.HasFile))//&& (ext...
Mike
1 vote | 1 answer | 106 views

C++ efficient way to find matches between two std::map

I have a data-set acquired with an RGB-D camera and a text file where for each image of the data-set, the timestamps and the filenames are stored. What I do is to parse this file and fill up two std::map, one for rgb images and the other for depth images. Now, since the timestamps don't intersect, I...
Federico Nardi
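
Because both maps are ordered by timestamp, the matching described above can be done with a binary search per RGB frame instead of comparing every pair. The sketch below shows the idea in Python with `bisect`; in C++ the corresponding lookup on a `std::map` would be `lower_bound`. The function name `match_nearest`, the `max_diff` tolerance, and the sample timestamps and filenames are assumptions, since the excerpt is cut off before the matching rule is stated.

```python
import bisect


def match_nearest(rgb, depth, max_diff):
    """Pair each RGB frame with the closest depth frame within max_diff seconds."""
    depth_times = sorted(depth)
    pairs = []
    for t in sorted(rgb):
        i = bisect.bisect_left(depth_times, t)
        # Candidates: the depth timestamp just before t and the first one at or after t.
        candidates = depth_times[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda d: abs(d - t))
        if abs(best - t) <= max_diff:
            pairs.append((rgb[t], depth[best]))
    return pairs


# Made-up timestamps (seconds) mapped to made-up filenames.
rgb = {1.00: "rgb/1.00.png", 1.10: "rgb/1.10.png", 1.50: "rgb/1.50.png"}
depth = {1.02: "depth/1.02.png", 1.09: "depth/1.09.png", 2.00: "depth/2.00.png"}

pairs = match_nearest(rgb, depth, max_diff=0.05)
print(pairs)
```

Note that this sketch lets one depth frame be matched to more than one RGB frame; enforcing one-to-one pairs would need an extra step.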
1 vote | 1 answer | 94 views

How to extract top three values from 12 different columns and return the associated row name?

I am using a 43x12 data set that is built into R. The 43 rows are different people's names and the 12 columns are different stats. I need to get the people's names that scored in the top 3 for each stat. I can mostly do this except if two people have the exact same value for one stat I need to break...
Ashlee Berry
1 vote | 2 answers | 4k views

Spark: How can DataFrame be Dataset[Row] if DataFrames have a schema

This article claims that a DataFrame in Spark is equivalent to a Dataset[Row], but this blog post shows that a DataFrame has a schema. Take the example in the blog post of converting an RDD to a DataFrame: if DataFrame were the same thing as Dataset[Row], then converting an RDD to a DataFrame should...
man on laptop
1 vote | 2 answers | 780 views

Apache Spark update a row in an RDD or Dataset based on another row

I'm trying to figure out how I can update some rows based on another row. For example, I have some data like Id | useraname | ratings | city -------------------------------- 1, philip, 2.0, montreal, ... 2, john, 4.0, montreal, ... 3, charles, 2.0, texas, ... I want to update the users in the sa...
1 vote | 2 answers | 426 views

How can I import custom metrics from App Insights to Power BI?

I'm able to add an App Insights DataSet through the Services part of 'Content Pack Library'. The import of all of the data from App Insights APPEARS to work. However, none of the custom metrics I've added to App Insights show up in the available 'Fields' in the DataSet that was just imported. All...
TheDude
1 vote | 2 answers | 367 views

Adding a DataRow[] array to a DataGridView using C# forms

I currently have the following code. My problem is how to get the DataRow[] array data and show it in a DataGridView. DataSet ds = new DataSet(); Data dt = ds.Tables['Tables']; string path = Application.StartupPath + '\\test.xml'; Int MdNum = 1; //assign xmlfile to data set ds.ReadXml(path); /...
user1998735
1 vote | 1 answer | 1.9k views

Spark 2.0 - Convert DataFrame to DataSet

I want to load my data and do some basic linear regression on it. So first, I need to use VectorAssembler to produce my features column. However, when I use assembler.transform(df), df is a DataFrame, and it expects a DataSet. I tried df.toDS, but it gives value toDS is not a member of org.apache.sp...
Béatrice Moissinac
1 vote | 1 answer | 533 views

What generates DataSet.designer.vb files

I have continually run into problems trying to repair code from another developer, and one method I opted to go with was manually changing the generated dataset.designer.vb file to include overloaded database CRUD methods... After some searching, and poking around the code files, I still cannot figu...
William Tolliver
1 vote | 2 answers | 334 views

How to fill the null value in dataframe to uuid?

There is a dataframe with null values in one column (not all of them null), and I need to fill the null values with a uuid. Is there a way? scala> val df = Seq(('stuff2',null,null), ('stuff2',null,Array('value1','value2')),('stuff3','stuff3',null)).toDF('field','field2','values') df: org.apache.spark.sql.Data...
Robin Wang
1 vote | 3 answers | 172 views

Is DataSet or DataTable a value type in C#?

I know DataSet and DataTable are classes thus they must be reference types. But whenever I pass them in any method, I have to return ds or dt to get them filled. Example: DataSet ds = new DataSet(); FillDataSet(ds); //This get data from database This will not fill my dataset but below one will. Da...
Aman Chauhan
1 vote | 2 answers | 409 views

How to fill a new column in a data.frame based on row values by two conditions in R

I am trying to figure out how to generate a new column in R that accounts for whether a politician 'i' remains in the same party or defects in a given legislature 'l'. These politicians and parties are identified by indexes. Here is an example of what my data originally looks like: ## examp...
1 vote | 1 answer | 1.2k views

How to extract link url from Excel cell

I have a c# webjob that downloads and then reads an Excel file. One of the columns contains links that I'd like to save in my database. I'm currently using ExcelDataReader to convert the Excel file to a DataSet and then looping through the rows to grab the data. After conversion the column in que...
Alec Menconi