brede_mat_elimstop

PURPOSE

brede_mat_elimstop - Eliminate stop words

SYNOPSIS

function Mout = brede_mat_elimstop(Min, varargin)

DESCRIPTION

 brede_mat_elimstop   - Eliminate stop words

       Mout = brede_mat_elimstop(Min)
       Mout = brede_mat_elimstop(Min, 'PropertyName', PropertyValue)

       Input:    Min        'Mat' structure

       Property: Filename   [ {stop_english1.txt} | stop_medline.txt
                            | stop_meshcommon.txt | stop_pubmed_neg1.txt ]
                 Stopwords  Cell string of stop words.

       Output:   Mout       'Mat' structure

       Eliminate stop words from a 'mat' structure. By default the
       stop words are read from the data/stop_english1.txt file,
       unless the 'stopwords' property is set. The input and output
       'mat' structures should have the following format:

         M.matrix        Matrix containing the data
         M.rows          Documents ('bib' structures)
         M.columns       Cell string
         M.description   Textual description
         M.type = 'mat'
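
       The format above can be illustrated with a toy structure. The
       sketch below uses base MATLAB only and is not the toolbox
       implementation: the data, the placeholder row labels (plain
       strings instead of 'bib' structures), and the stop word list
       are all made up, and it assumes that elimination means
       dropping the columns whose labels exactly match a stop word.

```matlab
% Minimal 'mat' structure with the fields listed above (toy data).
M.matrix      = [1 0 2 1; 0 3 1 0];      % 2 documents x 4 words
M.rows        = {'doc1', 'doc2'};        % placeholders for 'bib' structures
M.columns     = {'the', 'cortex', 'of', 'amygdala'};
M.description = 'Toy bag-of-words matrix';
M.type        = 'mat';

% Hypothetical stop word list; a real run would use one of the files.
stopwords = {'the', 'of'};

% Assumed behaviour: keep only columns whose label is not a stop word
% (exact, case-sensitive match).
keep = ~ismember(M.columns, stopwords);
Mout = M;
Mout.matrix  = M.matrix(:, keep);
Mout.columns = M.columns(keep);

disp(Mout.columns)
```

       The row information is left untouched in this sketch; only
       the matrix columns and their labels shrink together, so the
       two stay aligned.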

       Several stop word files are found in the 'brede/data'
       directory. Each contains one word per line:

         stop_english1.txt     - Contains 571 common English words
         stop_lobaranatomy.txt - Positive list with words from
                                 lobarAnatomy field
         stop_medline.txt      - Contains 243 words used in MEDLINE
         stop_meshcommon.txt   - Contains 298(?) words found in the
                                 MeSH PubMed/MEDLINE field that are
                                 too general to be descriptive for
                                 brain mapping
         stop_pubmed_neg1.txt  - Contains 2534(?) words found in the
                                 PubMed/MEDLINE abstract field that
                                 are too general to be descriptive
         stop_pubmed_pos1.txt  - An incomplete corresponding positive
                                 list with 941(?) words
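
       Because the files hold one word per line, a list in the same
       layout can be read into a cell string with base MATLAB. The
       sketch below is an illustration only: it writes a small
       hypothetical file to a temporary location rather than reading
       from 'brede/data', and it does not claim to reproduce how the
       toolbox itself loads these files.

```matlab
% Write a small hypothetical stop word file, one word per line.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', 'the', 'of', 'and');
fclose(fid);

% Read the file back; '%s' splits on whitespace, i.e. one word per line.
fid = fopen(fname, 'r');
C = textscan(fid, '%s');
fclose(fid);
stopwords = C{1}';        % 1 x N cell string

% Remove the temporary file again.
delete(fname);

disp(stopwords)
```

       A cell string built this way is the kind of value that the
       'stopwords' property described above expects.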

       Example: 
         f = fullfile(fileparts(which('brede')), 'xml', 'wobibs.xml');
         B = brede_read_xml(f, 'output', 'collapsesecond');
         M = brede_bib_bib2mat(B, 'type', 'abstract');
         M = brede_mat_elimsingle(M)
         M = brede_mat_elimstop(M, 'filename', 'stop_english1.txt')
         M = brede_mat_elimstop(M, 'filename', 'stop_medline.txt')
         M = brede_mat_elimstop(M, 'filename', 'stop_lobaranatomy.txt')
         M = brede_mat_elimstop(M, 'filename', 'stop_meshcommon.txt')
         M = brede_mat_elimstop(M, 'filename', 'stop_pubmed_neg1.txt')
         brede_ui_mat(M)
 
       See also BREDE, BREDE_MAT, BREDE_STR, BREDE_MAT_ELIMSINGLE,
                BREDE_BIB_BIB2MAT, BREDE_STR_STR2MAT.

 $Id: brede_mat_elimstop.m,v 1.14 2007/07/04 17:13:22 fn Exp $

Generated on Fri 27-Nov-2009 18:11:22 by m2html © 2005