KAMIL WAIS PDF

This was the first time a plenary took place in Asia. Now I can share my experience with you. First, RDA is a truly global, intercontinental organization with a mission to build the social and technical bridges that enable open sharing of research data. Its global character is visible at every level.




To classify the authors and the people mentioned in article titles by gender, we used the genderizeR package (Wais a) for R (R Core Team). The package guesses the gender of a person based on the first name and the data gathered in the genderize. Created in August , the database has been regularly updated since, by the continuous scanning of public profiles of social network users. In April , the genderize.

Glossary

Authorship: the unique combination of the title of an article and the name of one of its authors. Note that the same author can publish more than one article, so the number of authorships will be greater than the number of authors.

Unisex first name: a first name that can be used by both men and women.

Gender database: a database used for gender classification; in our study, we used genderize.

Probability: given a first name, the probability that a person with this first name is a man or a woman, depending on the context. If the probability is 0.

Count: the number of people in the gender database with the same first name.
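The distinction between authors and authorships can be illustrated with a minimal Python sketch. The article titles, author names, and the helper `count_authorships` are hypothetical and only serve to show why the number of authorships exceeds the number of authors.

```python
def count_authorships(articles):
    """Return (number of authorships, number of distinct authors).

    `articles` is a list of (title, [author names]) pairs.
    An authorship is a unique (title, author) combination.
    """
    authorships = {(title, author) for title, authors in articles for author in authors}
    authors = {author for _, author in authorships}
    return len(authorships), len(authors)


# Hypothetical toy data: "Smith J" wrote two articles.
toy_articles = [
    ("In memoriam: A", ["Smith J", "Lee K"]),
    ("A tribute to B", ["Smith J"]),
]

n_authorships, n_authors = count_authorships(toy_articles)
print(n_authorships, n_authors)
```

Here one author contributes two authorships, so the two counts differ.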

Gender classification

We used the methodology suggested in Wais b to guess the gender of (i) people mentioned in titles of biographical articles and (ii) authors of these articles. The algorithm, available in the genderizeR package (Wais b), automatically parses all title words, checks in the genderize. In the third step above, the algorithm takes into account that some first names are valid for both men and women, and so classifying such names is always imprecise. Using the gender data from the database, we can estimate this uncertainty: given a first name, the probability of being a woman is estimated as the share of people with this first name who declared themselves as women.
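The share-based estimate described above can be sketched in a few lines of Python. This is not the genderizeR API; the toy counts in `TOY_DB` and the function `guess_gender` are assumptions made for illustration only.

```python
from collections import namedtuple

GenderGuess = namedtuple("GenderGuess", ["name", "gender", "probability", "count"])

# Hypothetical counts per first name: (declared women, declared men).
# These numbers are invented and do not come from any real gender database.
TOY_DB = {
    "anna": (980, 20),
    "john": (10, 990),
    "andrea": (600, 400),  # a unisex first name
}


def guess_gender(first_name, db=TOY_DB):
    """Guess gender from a first name; return None if the name is not in the database."""
    key = first_name.strip().lower()
    if key not in db:
        return None
    women, men = db[key]
    count = women + men
    share_women = women / count  # estimated probability of being a woman
    if share_women >= 0.5:
        return GenderGuess(key, "female", share_women, count)
    return GenderGuess(key, "male", 1 - share_women, count)


print(guess_gender("Andrea"))
```

For a unisex name such as the hypothetical "andrea" entry, the reported probability stays well below 1, which is the uncertainty the text refers to.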

Validation of gender classifications

Validation datasets

We validated the algorithm with a random sample of unique biographical articles.

This way, we coded the gender of persons in the titles.

Similarly, to validate how precisely the algorithm classified the gender of authors, we randomly sampled biographical articles and extracted author names.

We coded the gender of authorships.

Training the algorithm

From the genderize. We have to decide whether we wish to work only with names for which this probability is close to 1, or whether we also accept names for which this probability is closer to 0.

Thus, to train the algorithm for classifying gender, we should check different threshold values of this probability and choose the best one. The algorithm will not use first names with probabilities below this threshold; this way, we can decrease the uncertainty of our classifications at the cost of ignoring unisex first names. We should also be cautious when using rare unisex first names. To decide which names should be included in the algorithm and which ignored, we should test different threshold values for counts of how many times a first name was recorded in the gender database; the algorithm will use only those first names which occurred more often than the threshold.

So, we looked for the optimum values of these two parameters: the probability that a first name represents a particular gender, and the count of how many times a first name was recorded in the database with gender data (Wais b).

Based on a preliminary, exploratory analysis, we decided that the optimum probability should be between 0. Note that the algorithm should be trained independently for the two datasets: titles and authorships. For both datasets, we checked all combinations of (i) probability between 0. The best combination is the one that leads to the highest accuracy of gender classification, that is, the one for which the algorithm matches the manually coded data in the highest number of cases.
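The search over the two thresholds can be sketched as a plain grid search in Python. This is an illustration under stated assumptions, not the genderizeR implementation: the guesses, the manual codes (with `None` standing for "no identifiable gender"), and the candidate threshold values are all invented.

```python
from itertools import product


def classify(guess, min_probability, count_threshold):
    """Return the guessed gender, or None if the name fails either threshold.

    A name is used only if its probability is not below `min_probability`
    and its count is greater than `count_threshold`.
    """
    gender, probability, count = guess
    if probability < min_probability or count <= count_threshold:
        return None
    return gender


def accuracy(guesses, manual_codes, min_probability, count_threshold):
    """Share of cases in which the classification matches the manual coding."""
    hits = sum(
        classify(g, min_probability, count_threshold) == truth
        for g, truth in zip(guesses, manual_codes)
    )
    return hits / len(manual_codes)


def grid_search(guesses, manual_codes, probabilities, counts):
    """Return the (probability, count) pair with the highest accuracy."""
    return max(
        product(probabilities, counts),
        key=lambda pc: accuracy(guesses, manual_codes, pc[0], pc[1]),
    )


# Hypothetical validation data: (guessed gender, probability, count) per case.
toy_guesses = [
    ("female", 0.98, 1000),
    ("male", 0.99, 2000),
    ("female", 0.55, 500),  # unisex name
    ("male", 0.95, 3),      # rare name
    ("female", 0.92, 50),
]
toy_manual = ["female", "male", None, None, "female"]

best = grid_search(toy_guesses, toy_manual, [0.5, 0.7, 0.95], [0, 10, 100])
print(best, accuracy(toy_guesses, toy_manual, *best))
```

Thresholds that are too low let unisex and rare names through, while thresholds that are too high discard names that would have been classified correctly, so accuracy peaks at an intermediate combination in this toy example.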

For the validation dataset of titles, the algorithm worked best with the probability parameter set to 0. Using these values, we obtained a relatively small overall classification error rate 8. The gender bias error rate in automatic gender classification was also low 4. Since we estimated the overall classification error rate 8.

Thus, to get a more realistic indicator of the classification error rate, we also estimated a more robust bootstrapped error rate 8. For the validation dataset of authorships, the algorithm worked best with the probability parameter set to 0. Using these values, we obtained a small overall classification error rate 6.
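One simple way to bootstrap an error rate is sketched below in Python. It is a simplified illustration: it only resamples the validation cases with replacement and averages the error rate, and it does not re-tune the thresholds within each resample, which the study's own procedure may do. The function name and the toy labels are assumptions.

```python
import random


def bootstrap_error_rate(predicted, manual, n_resamples=1000, seed=0):
    """Average classification error rate over bootstrap resamples of the cases."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    errors = [p != m for p, m in zip(predicted, manual)]
    n = len(errors)
    total = 0.0
    for _ in range(n_resamples):
        # Draw n cases with replacement and compute the error rate of the resample.
        resample = [errors[rng.randrange(n)] for _ in range(n)]
        total += sum(resample) / n
    return total / n_resamples


# Hypothetical labels: 2 of 10 cases are misclassified.
toy_predicted = ["f", "m", "f", "f", "m", "m", "f", "m", "f", "m"]
toy_manual = ["f", "m", "m", "f", "m", "m", "f", "f", "f", "m"]

print(bootstrap_error_rate(toy_predicted, toy_manual))
```

The spread of the resampled error rates, not shown here, could also be used to attach an interval to the estimate.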

Categories of biographical articles

Terminology

Web of Science defines biographical items and items about an individual (which we combine into a single document type, biographical articles) as, broadly, articles focused on the lives of individuals, obituaries, tributes, and commemorations, as well as tributes to such people.

The latter group represents articles that are not considered biographical in the traditional sense; these can be, for example, transcripts of lectures or review articles on a given topic, whose only relation to an individual is the dedication of the article. Individual biographical articles can therefore differ considerably. For this reason, we conducted an in-depth analysis of a sample of biographical articles to find out whether they can be classified into distinct categories. After a preliminary analysis, we divided the articles into those about living people and those about deceased people.


Biographical articles in scientific literature: analysis of articles indexed in Web of Science

The accuracy of prediction can be controlled by two parameters: the count of a first name in the database and the probability of the prediction. These methods have applications ranging from bibliometric studies to customizing commercial offers for web users. Analyses of gender disparities in science based on such methods are published in the most prestigious journals, although they could be improved by choosing the most suitable prediction method with optimal parameters and by performing validation studies using the best data source for a given purpose. There is also a need to monitor and report how well a given prediction method works in comparison to others.


R software integrated with OpenPoland
