On this blog we have already talked about large language models (LLMs) such as ChatGPT. That post was from 2023, and since then we have watched their use spread into every sphere, from phones to schools to businesses.
They have also spread widely in science. For example, the 1st SEBiBC congress had a session devoted to AI where they were discussed. In my case, I have deepseek-r1:14b and qwen2.5-coder:latest installed on my machine on top of ollama and VSCode, and they help me write code. But for which uses is it legitimate to rely on them in science?
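As an aside, running a local model like the ones above avoids sending anything to a commercial provider, which matters for the confidentiality concerns discussed below. A minimal sketch of querying a local model through Ollama's HTTP API (assuming an Ollama server is running on its default port, 11434, with the model already pulled):

```python
# Sketch: ask a local Ollama model a question over its /api/generate endpoint.
# Assumes `ollama serve` is running locally and the model has been pulled.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Build the JSON payload expected by Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON response instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask("qwen2.5-coder:latest", "Write a one-line Python string reverse."))
```

Since everything stays on your own machine, none of the prompt text is shared with a third party.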
The International Society for Computational Biology (ISCB), based in the USA, published a brief set of guidelines in early April, which they say they will update as needed. At the pace things are moving, I don't think that will take long...
Since the ISCB is based in the USA, the guidelines mention bodies such as the NIH; in any case, they can be applied to other settings with few changes. I copy and paste them here in English:
Confidentiality
When using commercial LLMs, such as ChatGPT or Gemini, data may be reused and thus it is important that confidential or personal information is not shared. This is particularly important with respect to peer review. The NIH currently forbids the use of LLMs in peer review for this reason (see NIH policy). Many institutions have also developed further policies that may apply. Below we list the acceptable and unacceptable uses of LLMs and related technologies. Note that acceptable use cases only apply where confidentiality is not an issue.
Unacceptable Uses
It is not acceptable to use LLMs or related technologies to draft paper sections. In essence, papers MUST be written by humans.
It is not acceptable to use LLMs or related technologies to carry out reviewing activities, such as scientific peer reviews and promotion and tenure reviews. Firstly, these are an important part of the scientific process and they require scientific judgement. Secondly, review processes are in general confidential and should not be shared with third parties, including commercial LLM providers.
LLMs cannot be listed as authors as they do not fulfill the requirements of authorship as laid out in the ICMJE guidelines.
Acceptable Uses
As an algorithmic technique for research study in your research e.g. LLMs for protein structure prediction.
As an aid to correct written text (spell checkers, grammar checkers).
As an aid to language translation, however, the human is responsible for the accuracy of the final text.
As an evaluation technique (to assist in finding inconsistencies or other anomalies).
It is permissible to include LLM generated text snippets as examples in research papers where appropriate, but these MUST be clearly labeled and their use explained.
Assist in code writing, however, the human is responsible for the code.
Create documentation for code, however, the human is responsible for the correct documentation.
To discover background information on a topic, subject to verification from trusted sources.
Fuente: https://www.iscb.org/iscb-policy-statements/iscb-policy-for-acceptable-use-of-large-language-models
See you soon