Select Subsets of Data

Why Select Subsets of Data?

You can use data selection to create independent data sets for estimation and validation.

You can also use data selection as a way to clean the data and exclude parts with noisy or missing information. For example, when your data contains missing values, outliers, level changes, and disturbances, you can select one or more portions of the data that are suitable for identification and exclude the rest.
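As a rough illustration of excluding portions with missing information, the following Python sketch finds the contiguous runs of a signal that contain no missing values. It is not part of the toolbox; the function name `clean_segments`, the use of NaN to mark missing samples, and the `min_length` parameter are assumptions made for this example.

```python
import math

def clean_segments(y, min_length=1):
    """Return (start, stop) index pairs (stop exclusive) for runs of y
    that contain no NaN and are at least min_length samples long."""
    segments = []
    start = None  # start index of the current NaN-free run, if any
    for i, value in enumerate(y):
        if math.isnan(value):
            # A missing sample closes the current run.
            if start is not None and i - start >= min_length:
                segments.append((start, i))
            start = None
        elif start is None:
            start = i
    # Close a run that extends to the end of the record.
    if start is not None and len(y) - start >= min_length:
        segments.append((start, len(y)))
    return segments

# Hypothetical output signal with two missing samples.
y = [0.1, 0.3, float("nan"), 0.2, 0.4, 0.5, float("nan"), 0.7]
segments = clean_segments(y, min_length=2)
```

Each returned pair can then be used to slice the input and output signals together, so that the selected portions stay aligned.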

If you have only one data set and you want to estimate linear models, you should split the data into two portions to create two independent data sets for estimation and validation, respectively. Splitting the data means selecting parts of the data set and saving each part independently.
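The idea of splitting one record into two independent portions can be sketched in plain Python as follows. This is a conceptual example only and does not use the toolbox; the function name `split_data` and the default 50/50 split are assumptions.

```python
def split_data(u, y, fraction=0.5):
    """Split input u and output y into an estimation part and a
    validation part. The first `fraction` of the samples is used for
    estimation and the remainder for validation."""
    if len(u) != len(y):
        raise ValueError("input and output must have the same length")
    n_est = int(len(y) * fraction)
    estimation = (u[:n_est], y[:n_est])
    validation = (u[n_est:], y[n_est:])
    return estimation, validation

# Hypothetical 10-sample record.
u = [0, 1, 0, 1, 1, 0, 0, 1, 0, 1]
y = [0.0, 0.1, 0.9, 0.2, 1.0, 1.1, 0.3, 0.1, 0.8, 0.2]
(u_est, y_est), (u_val, y_val) = split_data(u, y)
```

The two parts do not overlap, so a model estimated on the first part can be checked against data it has not seen.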

You can merge several data segments into a single multiexperiment data set and identify an average model. For more information, see Create Data Sets from a Subset of Signal Channels or Representing Time- and Frequency-Domain Data Using iddata Objects.

Note

Subsets of the data set must contain enough samples to adequately represent the system, and the inputs must provide suitable excitation to the system.

Selecting portions of frequency-domain data is equivalent to filtering the data. For more information about filtering, see Filtering Data.

Extract Subsets of Data Using the App

Ways to Select Data in the App

You can use the System Identification app to select ranges of data on a time-domain or frequency-domain plot. Selecting data in the frequency domain is equivalent to passband-filtering the data.

After you select portions of the data, you can use one data segment for estimating models and the other data segment for validating models. For more information, see Specify Estimation and Validation Data in the App.

Note

Selecting <--Preprocess > Quick start performs the following actions simultaneously:

  • Remove the mean value from each channel.

  • Split the data into two parts.

  • Specify the first part as estimation data (or Working Data).

  • Specify the second part as Validation Data.
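The combined effect of these actions can be approximated with the following Python sketch. It is not the app's implementation: the function name `quick_start`, the choice to compute the mean over the whole record before splitting, and the even split point are all assumptions made for illustration.

```python
def quick_start(channels):
    """channels maps a channel name to a list of samples.
    Returns (estimation, validation) dictionaries with the same keys."""
    n = len(next(iter(channels.values())))
    half = n // 2  # assumed split point: the middle of the record
    estimation, validation = {}, {}
    for name, samples in channels.items():
        # Remove the mean value from this channel.
        mean = sum(samples) / len(samples)
        centered = [s - mean for s in samples]
        # First part is estimation data, second part is validation data.
        estimation[name] = centered[:half]
        validation[name] = centered[half:]
    return estimation, validation

# Hypothetical single-input, single-output record.
channels = {"u": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 4.0, 6.0, 8.0]}
estimation, validation = quick_start(channels)
```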

Selecting a Range for Time-Domain Data

You can select a range of data values on a time plot and save it as a new data set in the System Identification app.

Note

Selecting data does not extract experiments from a data set containing multiple experiments. For more information about multiexperiment data, see Create Multiexperiment Data Sets in the App.

To extract a subset of time-domain data and save it as a new data set:

  1. Import time-domain data into the System Identification app, as described in Create Data Sets from a Subset of Signal Channels.

  2. Drag the data set you want to subset to the Working Data area.

  3. If your data contains multiple I/O channels, in the Channel menu, select the channel pair you want to view. The upper plot corresponds to the input signal, and the lower plot corresponds to the output signal.

    Although you view only one I/O channel pair at a time, your data selection is applied to all channels in this data set.

  4. Select the data of interest in either of the following ways:

    • Graphically: Draw a rectangle on either the input-signal or the output-signal plot with the mouse to select the desired time interval. Your selection appears on both plots regardless of the plot on which you draw the rectangle. The Time span and Samples fields are updated to match the selected region.

    • By specifying the Time span: Edit the beginning and the end times in seconds. The Samples field is updated to match the selected region. For example:

      28.5 56.8

    • By specifying the Samples range: Edit the beginning and the end indices of the sample range. The Time span field is updated to match the selected region. For example:

      342 654

    Note

    To clear your selection, click Revert.

  5. In the Data name field, enter a name for the new data set that will contain the selected data.

  6. Click Insert. This action saves the selection as a new data set and adds it to the Data Board.

  7. To select another range, repeat steps 4 to 6.
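The relationship between a time span and a sample range in step 4 can be sketched in Python as follows. This is a conceptual example, not the app's own computation: it assumes uniformly sampled data, 1-based sample indices, a hypothetical sample time `ts`, and a hypothetical start time `t_start` for the first sample.

```python
import math

def time_span_to_samples(t_begin, t_end, ts, t_start=0.0):
    """Return the 1-based (first, last) indices of the samples whose
    time stamps fall inside [t_begin, t_end]."""
    eps = 1e-9  # tolerance for floating-point round-off
    first = math.ceil((t_begin - t_start) / ts - eps) + 1
    last = math.floor((t_end - t_start) / ts + eps) + 1
    return first, last

def samples_to_time_span(first, last, ts, t_start=0.0):
    """Return the (begin, end) times of a 1-based inclusive sample range."""
    return t_start + (first - 1) * ts, t_start + (last - 1) * ts

# Assumed sample time of 0.5 s with the first sample at t = 0 s.
first, last = time_span_to_samples(28.5, 56.8, ts=0.5)
t_begin, t_end = samples_to_time_span(first, last, ts=0.5)
```

Under these assumptions, a requested end time that falls between two samples is rounded inward to the last sample inside the span, which is why the recovered end time can be slightly earlier than the one requested.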

Selecting a Range of Frequency-Domain Data

Selecting a range of values in the frequency domain is equivalent to filtering the data. For more information about data filtering, see Filtering Frequency-Domain or Frequency-Response Data in the App.
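To see why selecting a frequency range acts like a passband filter, consider this minimal Python sketch. It keeps only the frequency points inside a band and discards the rest. The function name `select_frequency_range` and the sample values are hypothetical, and the code does not use the toolbox.

```python
def select_frequency_range(freqs, response, f_low, f_high):
    """Keep only the points whose frequency lies in [f_low, f_high].
    Dropping the points outside the band has the same effect as an
    ideal passband filter applied to frequency-domain data."""
    kept = [(f, r) for f, r in zip(freqs, response) if f_low <= f <= f_high]
    return [f for f, _ in kept], [r for _, r in kept]

# Hypothetical frequency vector (rad/s) and complex response values.
freqs = [0.1, 1.0, 5.0, 10.0, 50.0]
response = [1 + 0j, 0.9 - 0.2j, 0.5 - 0.5j, 0.2 - 0.4j, 0.01 - 0.05j]
band_freqs, band_response = select_frequency_range(freqs, response, 1.0, 10.0)
```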

Extract Subsets of Data at the Command Line

Selecting ranges of data values is equivalent to subreferencing the data.

For more information about subreferencing time-domain and frequency-domain data, see Select Data Channels, I/O Data and Experiments in iddata Objects.

For more information about subreferencing frequency-response data, see Select I/O Channels and Data in idfrd Objects.
