Fix malformed journal links with double http:// prefix by Copilot · Pull Request #1357 · scieloorg/core

Copilot · 2026-02-24T13:24:52Z

collection.domain is a URLField storing full URLs (e.g., http://www.scielo.org.pe), but multiple code paths were blindly prepending http://, producing broken URLs like http://http//www.scielo.org.pe/scielo.php?....

What does this PR do?

Removes the duplicated http:// prefixes everywhere collection.domain is used to build scielo.php URLs. Before/after:

# Before
f"http://{domain}/scielo.php?script=sci_serial&pid={issn_scielo}&lng=en"

# After
f"{domain.rstrip('/')}/scielo.php?script=sci_serial&pid={issn_scielo}&lng=en"

The same fix was applied to the two HTML templates (removed http:// before {{ journal.collection__domain }}).

Unit tests were also added to cover the fixed cases.

Where should the review start?

core/home/views.py → _get_scielo_journals_data(), then the two templates in core/templates/home/include/list_journal_page/, and the new tests in core/home/tests.py and journal/tests.py.

How can this be tested manually?

Navigate to the alphabetical journal listing page (e.g., /pt-br/listar-por-ordem-alfabética/)
Search for any term and inspect the generated links
Confirm that the URLs are well formed (e.g., http://www.scielo.org.pe/scielo.php?...) with no double http:// prefix

Any context you would like to give?

The root cause is that domain is a Django URLField, which persists the full URL including the scheme. All four affected locations (views.py, journals_table.html, journals_table_by_publisher.html, classic_website.py) assumed the field contained only the hostname.
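The root cause can be reproduced in isolation. The following is a minimal sketch, not the project's code: `old_url` and `new_url` are hypothetical names that wrap the before/after f-strings from this description, and the domain value is the example given above.

```python
# Hypothetical stand-ins for the before/after f-strings in this PR.
def old_url(domain: str, issn_scielo: str) -> str:
    # Pre-fix behaviour: a scheme is always prepended.
    return f"http://{domain}/scielo.php?script=sci_serial&pid={issn_scielo}&lng=en"


def new_url(domain: str, issn_scielo: str) -> str:
    # Post-fix behaviour: the scheme stored in the field is trusted as-is.
    return f"{domain.rstrip('/')}/scielo.php?script=sci_serial&pid={issn_scielo}&lng=en"


domain = "http://www.scielo.org.pe"  # value as stored, scheme included
print(old_url(domain, "2709-3689"))  # scheme appears twice
print(new_url(domain, "2709-3689"))  # scheme appears once
```

Note that `new_url` only works when the stored value really does carry a scheme, which is the assumption the review comments further down question.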

Screenshots

N/A: a URL rendering problem, visible in the browser inspector or by hovering over the links in the journal list.

What are the relevant tickets?

Formação do link para os periódicos a partir de scielo.org está incorreta (in English: the links to journals built from scielo.org are incorrect)

References

Django URLField stores the full URL including the scheme: https://docs.djangoproject.com/en/stable/ref/models/fields/#urlfield

Original prompt

This section details the original issue you should resolve

<issue_title>Formação do link para os periódicos a partir de scielo.org está incorreta (in English: the links to journals built from scielo.org are incorrect)</issue_title>
<issue_description>### Problem description

Clicking here works
http://www.scielo.org.pe/scielo.php?script=sci_serial&pid=2709-3689&lng=es&nrm=iso

Clicking here does not work
https://www.scielo.org/pt-br/listar-por-ordem-alfab%C3%A9tica/?search_term=kaw

Note that the link that opens is incorrect:

http://http//www.scielo.org.pe/scielo.php?script=sci_serial&pid=2709-3689&lng=en</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes Formação do link para os periódicos a partir de scielo.org está incorreta #1356


The collection.domain field is a URLField storing full URLs with protocol (e.g., http://www.scielo.org.pe). Code was prepending 'http://' again, creating malformed URLs like 'http://http://www.scielo.org.pe/scielo.php?...'.

Fixed in:
- core/home/views.py: use domain directly without prepending http://
- journals_table.html: remove hardcoded http:// from href
- journals_table_by_publisher.html: remove hardcoded http:// from href
- journal/sources/classic_website.py: use collection directly without prepending http://

Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>

robertatakenaka · 2026-02-24T13:45:06Z

@copilot add unit tests

Copilot

Pull request overview

This PR aims to fix malformed SciELO journal links caused by blindly prepending http:// to collection.domain, which can already include a scheme, resulting in URLs like http://http//....

Changes:

Updated URL construction in core/home/views.py to stop hardcoding http:// and strip trailing /.
Updated classic website XML fetch URLs to stop hardcoding http:// and strip trailing /.
Updated two journal list templates to stop hardcoding http:// in generated links.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File	Description
`core/home/views.py`	Adjusts generated `scielo_url` to avoid double-scheme URLs.
`journal/sources/classic_website.py`	Adjusts scielo.php request URLs to avoid double-scheme URLs.
`core/templates/home/include/list_journal_page/journals_table.html`	Removes hardcoded `http://` prefix in journal link href.
`core/templates/home/include/list_journal_page/journals_table_by_publisher.html`	Removes hardcoded `http://` prefix in journal link href.


Copilot · 2026-02-24T13:46:25Z

core/templates/home/include/list_journal_page/journals_table.html

                                {% endif %}
-                                <a target="_blank" href="http://{{ journal.collection__domain }}/scielo.php?script=sci_serial&pid={{journal.issn_scielo}}&lng=en">{{ journal.journal__title }}</a><span style="color: #888; font-size: 0.9em; opacity: 0.7; padding: 4px 8px;">{{journal.collection__main_name}}</span>
+                                <a target="_blank" href="{{ journal.collection__domain }}/scielo.php?script=sci_serial&pid={{journal.issn_scielo}}&lng=en">{{ journal.journal__title }}</a><span style="color: #888; font-size: 0.9em; opacity: 0.7; padding: 4px 8px;">{{journal.collection__main_name}}</span>
                            </th>


This assumes journal.collection__domain already includes a scheme. In this codebase it can be stored as host-only (e.g. www.scielo.br in tests/fixtures), in which case this href becomes a relative URL and breaks. Consider adding a conditional to prefix http:// only when the domain lacks ://, or (preferably) pass a pre-normalized scielo_url from the view/context.
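One way to follow the "pre-normalized scielo_url from the view" suggestion is sketched below. This is an illustration under stated assumptions, not the code that was merged: `build_scielo_url` and its `default_scheme` parameter are hypothetical, the row dicts only imitate the keys the template reads, and the ISSN `0000-0000` is a placeholder.

```python
from typing import Optional


def build_scielo_url(
    domain: Optional[str], issn_scielo: str, default_scheme: str = "http"
) -> str:
    """Return an absolute scielo.php URL, or "" when no domain is stored."""
    base = (domain or "").strip().rstrip("/")
    if not base:
        return ""
    if "://" not in base:
        # Host-only value such as "www.scielo.br": add a scheme exactly once.
        base = f"{default_scheme}://{base}"
    return f"{base}/scielo.php?script=sci_serial&pid={issn_scielo}&lng=en"


# Rows shaped like the dicts the template iterates over (assumed shape).
rows = [
    {"collection__domain": "http://www.scielo.org.pe/", "issn_scielo": "2709-3689"},
    {"collection__domain": "www.scielo.br", "issn_scielo": "0000-0000"},
]
for row in rows:
    row["scielo_url"] = build_scielo_url(row["collection__domain"], row["issn_scielo"])
```

With a value like this in the context, the template could render `href="{{ journal.scielo_url }}"` and carry no scheme logic of its own. Whether the fallback scheme should be `http` or `https` is a project decision that this sketch does not settle.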

Copilot · 2026-02-24T13:46:25Z

core/templates/home/include/list_journal_page/journals_table_by_publisher.html

                                            <span class="material-icons align-middle fs-5" style="color: #c63800;">fiber_manual_record</span>
                                        {% endif %}
-                                        <a target="_blank" href="http://{{ scielojournal.collection.domain }}/scielo.php?script=sci_serial&pid={{ scielojournal.issn_scielo }}&lng=en">{{ scielojournal.journal.title }}</a><span style="color: #888; font-size: 0.9em; opacity: 0.7; padding: 4px 8px;">{{ scielojournal.collection.main_name }}</span> 
+                                        <a target="_blank" href="{{ scielojournal.collection.domain }}/scielo.php?script=sci_serial&pid={{ scielojournal.issn_scielo }}&lng=en">{{ scielojournal.journal.title }}</a><span style="color: #888; font-size: 0.9em; opacity: 0.7; padding: 4px 8px;">{{ scielojournal.collection.main_name }}</span> 


Same issue here: scielojournal.collection.domain may be stored without http(s):// (host-only), so this href can become relative/invalid. Add logic to prepend a scheme only when missing (e.g. check for ://), or build and provide a normalized URL from Python to keep templates simple and consistent.

Suggested change

<a target="_blank" href="{{ scielojournal.collection.domain }}/scielo.php?script=sci_serial&pid={{ scielojournal.issn_scielo }}&lng=en">{{ scielojournal.journal.title }}</a><span style="color: #888; font-size: 0.9em; opacity: 0.7; padding: 4px 8px;">{{ scielojournal.collection.main_name }}</span>

{% if "://" in scielojournal.collection.domain %}

<a target="_blank" href="{{ scielojournal.collection.domain }}/scielo.php?script=sci_serial&pid={{ scielojournal.issn_scielo }}&lng=en">{{ scielojournal.journal.title }}</a>

{% else %}

<a target="_blank" href="https://{{ scielojournal.collection.domain }}/scielo.php?script=sci_serial&pid={{ scielojournal.issn_scielo }}&lng=en">{{ scielojournal.journal.title }}</a>

{% endif %}

<span style="color: #888; font-size: 0.9em; opacity: 0.7; padding: 4px 8px;">{{ scielojournal.collection.main_name }}</span>

Copilot · 2026-02-24T13:46:25Z

journal/sources/classic_website.py

        collections = requests.get(
-            f"http://{collection}/scielo.php?script=sci_alphabetic&lng=es&nrm=iso&debug=xml",
+            f"{collection.rstrip('/')}/scielo.php?script=sci_alphabetic&lng=es&nrm=iso&debug=xml",
            timeout=10,


collection can be stored without a URL scheme (e.g. tests/fixtures use www.scielo.br). With the new f-string, requests.get() will receive a URL like www.scielo.br/scielo.php?... (missing scheme) and fail. Consider normalizing before building the URL: strip trailing / and prepend a scheme only when one is missing (e.g., if "://" not in collection: collection = "http://" + collection). Also guard against None values (e.g., collection = (collection or "")).
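The normalization described in this comment could look roughly like the sketch below. The helper names `normalize_collection`, `alphabetic_list_url`, and `journal_xml_url` are hypothetical, raising `ValueError` on an empty value is one possible way to handle `None`, and the network call is left out so the block runs on its own.

```python
def normalize_collection(collection, default_scheme="http"):
    """Strip trailing slashes and add a scheme only when one is missing."""
    base = (collection or "").strip().rstrip("/")
    if not base:
        # Fail early instead of sending a request to a malformed URL.
        raise ValueError("collection domain is empty")
    if "://" not in base:
        base = f"{default_scheme}://{base}"
    return base


def alphabetic_list_url(collection):
    # Same query string as the sci_alphabetic request in the diff above.
    return (
        f"{normalize_collection(collection)}"
        "/scielo.php?script=sci_alphabetic&lng=es&nrm=iso&debug=xml"
    )


def journal_xml_url(collection, issn):
    # Same query string as the sci_serial request in classic_website.py.
    return (
        f"{normalize_collection(collection)}"
        f"/scielo.php?script=sci_serial&pid={issn}&lng=es&nrm=iso&debug=xml"
    )
```

The caller would then pass the returned string to `requests.get(url, timeout=10)` as before, so both request sites share a single normalization step.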

Copilot · 2026-02-24T13:46:26Z

journal/sources/classic_website.py

        official_journal = requests.get(
-            f"http://{collection}/scielo.php?script=sci_serial&pid={issn}&lng=es&nrm=iso&debug=xml",
+            f"{collection.rstrip('/')}/scielo.php?script=sci_serial&pid={issn}&lng=es&nrm=iso&debug=xml",
            timeout=10,


Same issue as above: if collection lacks a scheme (common in this codebase, e.g. Collection(domain="www.scielo.br") in tests), this will generate a URL without http(s):// and requests.get() will fail. Normalize collection once (strip trailing /, add scheme if missing) before constructing the request URL, and handle None safely.

Copilot · 2026-02-24T13:46:26Z

core/home/views.py

            scielo_url = (
-                f"http://{domain}/scielo.php?script=sci_serial&pid={issn_scielo}&lng=en"
+                f"{domain.rstrip('/')}/scielo.php?script=sci_serial&pid={issn_scielo}&lng=en"
            )


collection__domain is not consistently stored with a URL scheme in this repo (tests/fixtures use values like www.scielo.br). After this change, if domain has no scheme the generated URL becomes www.scielo.br/scielo.php?... (relative/invalid in many contexts). Also, if domain is None, calling .rstrip('/') will raise. Suggest normalizing safely: domain = (domain or "").rstrip("/") and prepend a scheme only when missing (e.g. if domain and "://" not in domain: domain = "http://" + domain).

- core/home/tests.py: Tests for _get_scielo_journals_data() verifying that domain URLs with http:// prefix are not double-prefixed, that trailing slashes are stripped, and that https:// domains work
- journal/tests.py: Tests for classic_website.get_issn() and get_journal_xml() verifying URLs are built correctly when the domain already contains http:// or https://

Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>

Copilot · 2026-02-24T13:47:46Z

@copilot add unit tests

Added unit tests in commit 1da37de:

core/home/tests.py (TestGetScieloJournalsData): 4 tests for _get_scielo_journals_data() — verifies no double http:// prefix, correct URL format, trailing slash stripping, and https:// domains.
journal/tests.py (TestClassicWebsiteGetIssn, TestClassicWebsiteGetJournalXml): 6 tests for get_issn() and get_journal_xml() — verifies the HTTP request URLs are well-formed when the collection domain already includes the protocol (http:// or https://).
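The kind of check these tests perform can be sketched as follows. This is not the test code from commit 1da37de: `scielo_url` is a hypothetical stand-in for the URL-building line in `_get_scielo_journals_data()`, and the real tests may be structured differently.

```python
import unittest


def scielo_url(domain, issn_scielo):
    # Stand-in for the fixed f-string in core/home/views.py.
    return f"{domain.rstrip('/')}/scielo.php?script=sci_serial&pid={issn_scielo}&lng=en"


class TestScieloUrl(unittest.TestCase):
    def test_scheme_is_not_duplicated(self):
        # Maps each stored domain to the base the URL is expected to start with.
        cases = {
            "http://www.scielo.org.pe": "http://www.scielo.org.pe",
            "http://www.scielo.org.pe/": "http://www.scielo.org.pe",
            "https://www.scielo.br": "https://www.scielo.br",
        }
        for domain, expected_base in cases.items():
            with self.subTest(domain=domain):
                url = scielo_url(domain, "2709-3689")
                # Exactly one scheme separator, and no doubled slash before the path.
                self.assertEqual(url.count("://"), 1)
                self.assertTrue(url.startswith(expected_base + "/scielo.php?"))
```

Saved as a module, the case can be run with `python -m unittest`. A host-only domain such as `www.scielo.br` is deliberately left out here, because the merged f-string does not add a scheme for it.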

Initial plan

df6f10c

Copilot AI assigned Copilot and robertatakenaka Feb 24, 2026

Copilot started work on behalf of robertatakenaka February 24, 2026 13:25 View session

Copilot AI changed the title ~~[WIP] Fix incorrect link generation for scielo.org journals~~ Fix malformed journal links with double http:// prefix Feb 24, 2026

Copilot AI requested a review from robertatakenaka February 24, 2026 13:30

Copilot finished work on behalf of robertatakenaka February 24, 2026 13:31

robertatakenaka marked this pull request as ready for review February 24, 2026 13:43

Copilot AI review requested due to automatic review settings February 24, 2026 13:43

Copilot started reviewing on behalf of robertatakenaka February 24, 2026 13:43 View session

Copilot started work on behalf of robertatakenaka February 24, 2026 13:45 View session

Copilot AI reviewed Feb 24, 2026

Copilot finished work on behalf of robertatakenaka February 24, 2026 13:48

robertatakenaka merged commit 6d5da52 into main Feb 24, 2026
2 checks passed

robertatakenaka merged 3 commits into main from copilot/fix-scie-lo-generate-link


3 participants
