Access to Information in Brazil and Data Science: Descending Hierarchical Classification of requests made to the São Paulo municipality from 2012 to 2019
DOI:
https://doi.org/10.36428/revistadacgu.v14i26.544
Keywords:
access to information, access to information requests, computer text analysis, descending hierarchical classification, topic classification
Abstract
This article sought to understand how Data Science, text mining, and text classification technologies can contribute to a better, aggregated understanding of access to information requests. The research used data from access to information requests made to the Municipality of São Paulo (PMSP) from 2012 to 2019, available on the municipality's Open Data Portal, and proposes identifying and classifying the main issues raised in them. The texts of 39,369 requests submitted to PMSP were gathered into a corpus and analyzed through a Descending Hierarchical Classification (CHD, from the Portuguese Classificação Hierárquica Descendente). Five demographic variables were added for each request in the corpus, which then went through a standard text pre-processing routine; 31,946 (81.16%) were selected for analysis by CHD. By proposing text classification as a methodology for analyzing textual data, the study reinforces the view that textual data do not belong only to the qualitative field. In addition, considering only nouns, excluding verbs and adverbs, and using the most frequent adjectives as part of vocabulary expressions sharpened the context of each request, allowing the textual data to be classified more objectively and mitigating researcher bias. Seven classes resulted from the analysis through the Descending Hierarchical Classification: 1 - Neighborhoods and districts; 2 - Procedure and procedural documents; 3 - Public contracts; 4 - Urban mobility; 5 - Family: health, education and social assistance; 6 - Real estate; and 7 - Public tenders and positions (workers). The article also reviews the main references on the analysis of access to information requests, covering cases in Mexico, national studies in Brazil, and the city of Beijing, China. It contributes to understanding citizens' requests in an aggregated way, giving decision makers a better understanding of society's demands, which may result in more focused public policies.
License
Copyright (c) 2022 Revista da CGU

This work is licensed under a Creative Commons Attribution 4.0 International License.
The Revista da CGU follows the Creative Commons Attribution 4.0 International License (CC BY), which allows the use and sharing of published works with mandatory indication of authors and sources. Contents published until 2019 have generic permission for use and sharing with mandatory indication of authorship and source.
We highlight some essential, non-exhaustive related points:
- The submission of the proposal implies a commitment not to submit it to another journal and authorizes its publication if approved.
- The submission of the proposal also implies that the author(s) agrees with the publication, without resulting in remuneration, reimbursement, or compensation of any kind.
- The published texts are the responsibility of the authors and do not necessarily represent the opinion of the journal.
- The author(s) bear responsibility for any plagiarism.
- The person responsible for the submission declares, under the penalties of the Law, that the information on the authorship of the work is complete and correct.
Also highlighted are the items related to our Editorial Policies, in particular on the Focus and Scope, Publication Ethics, Peer Review Process, and Open Access Policy.
