

Paper

Submissions

  1. 1st submission
    1. Title: Estimating abundance at age with Bayesian geostatistics and compositional data analysis
    2. Authors:
    3. Journal:
  2. 2nd submission
    1. Request for reconsideration and re-evaluation made by Ernesto to CJFAS, accepted on 10/04/2008
    2. Title: Modelling spatio-temporal abundance at age with Bayesian geostatistics and compositional data analysis
    3. Authors:
    4. Journal:
    5. Versions of the text:
  3. 3rd submission

In this revision of the paper we have made a major effort to present a more objective MS than the previous version, focusing on the essence of the proposed methodology and removing aspects of the previous work that diverted the text from its main goal without substantially improving the findings, or that even brought unnecessary complications, if not biased results.

Correspondence with journal(s)

(Negative) response to the 1st submission to CJFAS

Dear Dr. Jardim,

20383 - Estimating abundance at age with Bayesian geostatistics and compositional data analysis

We very much regret that we cannot consider this manuscript for publication in CJFAS. The problem concerns a conflict with our editorial policy, and does not reflect upon the quality of your science. Upon receipt of your submission, we sought advice from an Associate Editor. The consensus is that, while we recognise the importance of the study, it presents a heavier emphasis on the statistics and modelling than we typically consider for CJFAS, and therefore we believe it would be better suited to a journal such as Biometrics.

It has become evident that we must restrict the scope and size of the Journal in some way, while trying to preserve its general multi-disciplinary character. The most equitable solution, we believe, is to refer to other more specialised outlets reports that are basically descriptive, or whose novelty relates only to new data or to the particular situation studied, or methods papers that apply standard techniques without breaking new methodological ground. In other words, we attempt to select work that leads to a conceptual advance or a refined understanding of general processes or phenomena (see our policy statement inside the front cover of each Journal issue and an editorial in CJFAS 55(1): 1-2).

We trust you will appreciate that an initial editorial decision avoids the delays of the peer review process, and will allow you to re-submit your manuscript elsewhere. We thank you for your interest in CJFAS and wish you well in finding another outlet for your work.

Yours sincerely,

Don Jackson Editor

Request for reconsideration to CJFAS

Dear Don Jackson,

Recently the CJFAS published several papers with a strong statistical emphasis. Among others you have published in 2008:

Can. J. Fish. Aquat. Sci. 65(1): 17-26 (2008) | doi:10.1139/F07-141 | © 2008 NRC Canada | A statistical modeling method for estimating mortality and abundance of spawning salmon from a time series of counts | R. Glenn Szerlong and David E. Rundio

Can. J. Fish. Aquat. Sci. 65(1): 117-133 (2008) | doi:10.1139/F07-153 | © 2008 NRC Canada | Hierarchical Bayesian modelling with habitat and time covariates for estimating riverine fish population size by successive removal method | Etienne Rivot, Etienne Prévost, Anne Cuzol, Jean-Luc Baglinière, and Eric Parent

Can. J. Fish. Aquat. Sci. 65(2): 176-197 (2008) | doi:10.1139/F07-138 | © 2008 NRC Canada | Estimating abundance of spatially aggregated populations: comparing adaptive sampling with other survey designs | Kathryn L. Mier and Susan J. Picquelle

All these papers carry a heavier statistical burden than the paper we submitted, which you rejected based on “a heavier emphasis on the statistics and modelling than we typically consider for CJFAS” and, as you said, without considering the quality of our science.

My opinion is that this paper is good and tackles an important issue for marine science, the estimation of the population structure from survey data, and I'd like to have it evaluated by the quality of the science. We revised the manuscript and added a flowchart with a visual representation of the algorithm to make it easier to understand by those not very familiar with statistics.

I really believe CJFAS is the right journal for this paper precisely because you have managed, over the years, to strike a balance between theoretical papers presenting important advances to science and holistic views that make those advances useful.

Once more I'd like to ask you to reconsider and accept this manuscript to be reviewed by my peers.

Sorry to bother you again, but looking at the recent CJFAS issues made me reconsider this submission.

Best regards

EJ

CJFAS editor's response to the request for reconsideration

Holly Foster wrote: Dear Dr. Jardim,

Thank you again for your interest in CJFAS for this work. If you have revised your work to make it more accessible to those less familiar with the statistics, we would be pleased to consider it again. You may send the revised version to me by email in Word or PDF format for an informal evaluation of its suitability for CJFAS, or you may resubmit through Osprey as a new submission for a formal evaluation. I look forward to hearing from you soon.

Best regards,

Holly Foster

Sending the 2nd submission to CJFAS

Dear Holly Foster,

Thanks for the opportunity to resubmit our work for an informal evaluation. To improve its readability we did a deeper revision than we had initially foreseen, which took longer than we expected. You'll notice we also changed the title.

Best regards.

EJ

Comments by Bill Venables

The work itself looks pretty good to me and if you wish I'll make a few suggestions about technical additions as well. Spatial notions and ideas are not foreign to the stock assessment community, and nor are Bayesian ideas now, but *few if any* handle it well. That's the real gap. The idea of a joint spatial-age composition parametric model seems to me likely to be exportable to many other stock assessment projects, so it is important you get this work published and widely read.

Comments from seminars

Two relevant comments were made during presentations at IPIMAR and in one of Paulo's courses

  1. There is a bias between the sample-based and the geostatistical estimates of abundance per year! This issue is related to the effect of the GLM that produces the calibrated abundance and to the use of the \beta of the spatial model to estimate abundance. In practice this is a different measure from the sample mean and need not have the same magnitude. On the other hand, the spatial model reduces the influence of clusters of very high observations, which does not happen with the sample mean, which is heavily influenced by extreme observations. This is included in the paper.
  2. Depth could be used in the modelling of the compositions to improve the model fit! This is a development of the model that should be considered in the future. It was not included in this work because depth is redundant with longitude/latitude owing to the shape of the Portuguese coast, and I thought we should not use the same information twice. However, since the two models are independent, that would not be a problem. We should include a comment on this issue.

CJFAS: 1st Review

J20558 – Referee #1

The paper is not well written. I have listed major comments below. More specific comments and some grammatical corrections are inserted directly in the attached pdf file. The authors advocate a fairly complex model for limited data. They need to better measure and describe the advantages of their proposed approach versus the simpler and commonly used design-based approach.

<note> ToDo: spatial patterns, best variance estimation, more knowledge by addressing different perspectives of abundance, etc </note>

1. The authors estimate age compositions separately from spatial stock density, or more specifically the component of stock density sampled by the trawl. However, this will not usually be appropriate because there will be both spatial variations in stock density and stock age compositions. For example, if juveniles are distributed closer to shore, near or in nursery areas, then one cannot estimate the age composition separately from spatial stock density. This is common for many groundfish species. The authors recognize that fish tend to distribute differently by size categories, and they should also recognize that the local abundance of the size categories will be different as well, with greater numbers of smaller sized fish for a species. For example, consider the very simple situation of separate inshore and offshore areas, within which a species is homogeneously distributed in two size classes: 100 small and 900 large offshore, and 3900 small and 100 large inshore. The combined age composition for both areas is (S,L)=(0.8,0.2), whereas averaging the age compositions for the two regions gives (S,L)=(0.54,0.46). The latter result is wrong because the total age-composition for both regions should be computed as a weighted-average, where the weights depend on total stock numbers in each region. L162-163. The authors describe a procedure to check if age-proportions are related to stock density. It seems incomplete. If local abundance (C_ih) and age proportions (P_ijh) are statistically independent then the proposed approach is OK. However, testing for independence by fitting a model with total catch as a covariate may not be enough.

A: Following the comments from both referees regarding the strong assumptions about independence between age structure and abundance level, we developed the model to work out the age proportions as a weighted-average, removing the possible bias.
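The referee's inshore/offshore arithmetic can be checked with a short sketch. It uses only the numbers quoted in comment 1; the function and variable names are illustrative and are not taken from the manuscript.

```python
# Catch numbers taken from the referee's inshore/offshore example.
regions = {
    "offshore": {"small": 100, "large": 900},
    "inshore": {"small": 3900, "large": 100},
}

def pooled_composition(regions):
    """Abundance-weighted composition: sum numbers over regions, then normalise."""
    totals = {}
    for counts in regions.values():
        for size, n in counts.items():
            totals[size] = totals.get(size, 0) + n
    grand_total = sum(totals.values())
    return {size: n / grand_total for size, n in totals.items()}

def mean_of_compositions(regions):
    """Unweighted mean of the per-region proportions (ignores region abundance)."""
    sums = {}
    for counts in regions.values():
        region_total = sum(counts.values())
        for size, n in counts.items():
            sums[size] = sums.get(size, 0.0) + n / region_total
    return {size: s / len(regions) for size, s in sums.items()}

weighted = pooled_composition(regions)
unweighted = mean_of_compositions(regions)
print(weighted)    # small 0.8, large 0.2
print(unweighted)  # small about 0.54, large about 0.46
```

The pooled version is the weighted average the referee asks for: each region contributes in proportion to its total numbers, so the abundant inshore region dominates the combined composition.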

2. The authors need to better describe the procedure used to calibrate the survey data. For example, if there are no covariate effects so that the GLM contained only a constant intercept term, would the calibrated and un-calibrated data be identical? If not, then the authors should defend why this is appropriate.

A: After evaluating the gains and losses of including this method for filtering effects not related to abundance, we decided to remove this analysis from the paper. It makes the model more complex and distracts the reader from the major objective (see letter to editor about the focus of the paper), blurring the main message.

3. The authors should ground-truth their proposed methods using some simulations. They should assess if their methods produce mean or median unbiased estimates, and if their 95% credibility intervals have a frequentist interpretation (i.e. cover the true values 95% of the time).

A: Under the new specification it is clear that the procedure is not biased. Both methods, model-based geostatistics and compositional data analysis, are statistically validated, so there is no reason why the combination of both should not give valid results. On the other hand, a full simulation study is outside the scope of the paper and would significantly increase its length and make it more difficult to follow. Nevertheless, we provide a limited simulation study to show that the method gives valid results.
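For reference, the coverage check requested in comment 3 has the following general shape. This is a toy sketch only: it simulates a normal mean with known standard deviation rather than the geostatistical and compositional models of the paper, and every name and parameter value in it is an assumption made for illustration.

```python
import random
from statistics import NormalDist, mean

def coverage(n_reps=2000, n_obs=30, true_mean=10.0, sd=3.0, level=0.95, seed=1):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided normal quantile
    half_width = z * sd / n_obs ** 0.5          # SD treated as known (assumption)
    hits = 0
    for _ in range(n_reps):
        # Simulate one data set from the known "truth" and build its interval.
        sample = [rng.gauss(true_mean, sd) for _ in range(n_obs)]
        centre = mean(sample)
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / n_reps

print(coverage())  # should be close to the nominal 0.95
```

In a study of the actual method, the simulated data sets would come from a known spatial and age-structured population, and the interval would be the 95% credibility interval of abundance at age.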

Simulation study page

An alternative idea for simulating data: maybe we have here an idea for a new model!

Specific comments from PDF file:

4. lines 6 - 8 “methods, providing means to overcome difficulties in obtaining the analytical expression of abundance at age.” This sentence is too vague to be useful. What problems are overcome?

A: Sentence revised.

5. lines 13 – 14 “provide an overview of abundance along different perspectives.” This sentence is too vague to be useful.

A: Sentence revised.

6. line 45 - correlations will have nothing to do with the modeling methods.

A: Sentence revised.

7. line 57 - A general style comment. Sections do not present anything - they are just places where text is presented.

<note> ToDo </note>

8. line 94 - Sample size was said to be limited to 97 hauls per year; however, there are usually fewer hauls than this reported in Table 1. Why the difference? Also, if there are 48 strata then there needs to be at least 96 hauls to achieve 2 hauls per strata. Clearly in most years many strata had no or one haul. How were design-based standard deviations computed in this case?

A: It is not always possible to fulfil the sampling programme due to operational constraints, in which case the variance is computed with a linear regression between variance and mean fitted to the strata with 2 samples. This is now explained in the manuscript.
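One possible reading of this answer is sketched below: fit a straight line of within-stratum variance on within-stratum mean using the strata with two hauls, then use the fitted line to assign a variance to strata with a single haul. The catches, stratum labels and function names are hypothetical, and the manuscript may implement the regression differently.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def sample_variance(values):
    """Unbiased sample variance (needs at least two values)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Hypothetical catches per haul, keyed by stratum (not survey data).
strata = {"A": [10.0, 14.0], "B": [40.0, 52.0], "C": [5.0, 7.0], "D": [30.0]}

# Fit the variance-mean line on strata with at least two hauls.
paired = {k: v for k, v in strata.items() if len(v) >= 2}
means = [sum(v) / len(v) for v in paired.values()]
variances = [sample_variance(v) for v in paired.values()]
intercept, slope = fit_line(means, variances)

# Predicted variance for single-haul strata, using the haul as the stratum mean.
imputed = {k: intercept + slope * v[0] for k, v in strata.items() if len(v) == 1}
print(imputed)
```

A straight line can predict a negative variance for strata with very low catches, so a real implementation would need to guard against that case.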

9. line 97 - How were ages determined? Were age-length keys used? Were ages estimated or measured for each fish. This should be described.

A: Yes, we use age-length keys (ALK). This is now explained in the manuscript.

10. line 109 “and taking into account the nature of each one” what does this mean? For example, how is abundance taken into account when estimating P_i?

A: The sentence referred to the fact that the models were adapted to the properties of the variables and to what they represent: numbers at age for the spatial behaviour and proportions for the population structure. This is now explained in the manuscript.

11. line 115 - Should give a “heads-up” that the choice of a will be described later.

12. line 120 “μˆi = μ¯i, the vector of marginal arithmetic means” This will usually not be appropriate. See Major Comment 1, attached.

A: See answer to Q1.

13. lines 131 – 132 “the reference conditions and adding the deviance residuals” This requires further explanation. See note 2 in attachment.

A: See answer to Q2.

14. lines 133 – 134 A NB GLM will also be sensitive to large catch. Keep in mind that the mle of the NB mean is the sample mean, which is not robust.

A: See answer to Q2.

15. lines 161 – 163 The second model needs to be described better. Write it down.

A: The manuscript was revised to account for this comment.

16. lines 165 – 168 I did not understand this. What is meant by inducing a small average change? Try being simple. Are the results sensitive to a difference choice of a and the constant. What happens if a=5 and the constant=0.01?

A: The results are not dependent on the age chosen. Null values are a problem here, as with other log models, and there is no simple way of dealing with them. Our approach was to use the multiplicative replacement strategy (). The rationale is that hake is spread along the coast at all depths and the null observations are likely to derive from the limited detectability of the gear. This is now explained in the manuscript.
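For readers unfamiliar with the technique, multiplicative replacement as usually defined in the compositional data literature can be sketched as follows. The replacement value `delta` and the example composition are assumptions chosen for illustration (0.01 echoes the constant mentioned by the referee); they are not the values used in the manuscript.

```python
def multiplicative_replacement(composition, delta):
    """Replace zeros by `delta` and shrink non-zero parts so the total is kept."""
    total = sum(composition)
    n_zeros = sum(1 for part in composition if part == 0)
    # Non-zero parts are scaled down by the share handed to the zeros.
    shrink = 1 - n_zeros * delta / total
    return [delta if part == 0 else part * shrink for part in composition]

# Hypothetical age proportions at one haul, with one age class not observed.
proportions = [0.5, 0.3, 0.2, 0.0]
replaced = multiplicative_replacement(proportions, delta=0.01)
print(replaced)  # approximately [0.495, 0.297, 0.198, 0.01]
```

Because every non-zero part is multiplied by the same factor, the ratios among the observed ages are unchanged, which is what matters for log-ratio models.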

17. line 170 A dome in the age-proportions does not mean survey catchability is domed. The right-hand part of the curve may decrease because of mortality and not catchability.

A: The manuscript was revised to account for this comment.

18. line 191 - Need to better describe the rationale for this. It seems to be that the authors are potentially removing variability in the calibrated observations, and this would not get captured in the Bayesian inferences. But perhaps I have missed something. If so, the authors should improve their description of the procedure. See Note 2 above.

A: See answer to Q2.

19. line 192 “Geostatistical analysis adopted” poor style

A: The manuscript was revised to account for this comment.

20. line 203 - Seems odd to use a discrete distribution for a variance parameter prior. The rationale for the choice should be described.

<note> PAULO: we need to look at this </note>

21. lines 204 – 205 “These probabilities…..0 and 2” Not clear what is going on here. Describe better.

<note> PAULO: we need to look at this </note>

22. line 215 - This seems too subjective. I think for survey analysis that people like more objective inferences. How sensitive are the statistical inferences (medians and credible intervals) to the choice of priors? Is it a problem? See Major note 3 above.

A: A sensitivity analysis was carried out and the impact is not high. However, in years with less information about the parameters, in particular \tau^2, different priors result in different posteriors because the data are not able to update the prior. See also the answer to Q3.

23. line 218 - It would be better to defend this when introducing the GLM. Explain why the log link is better to use.

A: See answer to Q2.

24. line 224 - Describe how the design-based standard error were computed, particularly when the sampling design was changed to systematic since 2005? Also, as mentioned previously, it seems that there are many strata with less than 2 samples.

A: See answer to Q8.

25. line 227 - what values? Y or RMAD. The precision is higher.

<note> ToDo </note>

26. line 232 - This does not seem to be a good reason. I think you could also argue that groups of null catches would get less weight in the geostatistical analysis, which would lead to higher estimates compared to the sample mean. A more convincing explanation is required.

A: The manuscript was revised to account for this comment. An analysis comparing the distance between locations within each stratum with the estimate of \phi is included, where it is shown that the observations are likely to be correlated and the variance under-estimated.

27. line 233 – why was the higher precision obtained with design estimators apparently over-optimistic for BTS?

A: See answer to Q26.

28. line 235 - But the designed-based approach is highly stratified (less than 2 observations per strata). There can be little residual correlation in the responses in this situation. The mean-model has 48 parameters (i.e. the strata) in the design-based approach. I am again unconvinced by this explanation, more is required.

A: See answer to Q27.

29. line 249 - So does the design-based approach?

30. line 250 - and I also do not accept it, for the same reasons.

31. line 251 - I would prefer the author show the unstandardized results.

A: The manuscript was revised to account for this comment.

32. line 259 - Defend why this is an improvement.

33. line 274 - vague text

A: The manuscript was revised to account for this comment.

34. line 281 – “supporting our decision on exploratory data analysis” What does this mean? Explain.

A: The manuscript was revised to account for this comment.

35. line 287 - This again raises the issue of robustness to assumptions. The authors are advocating a fairly complex model for limited data?? What are the advantages? See major note 3 above.

J20558 – Referee #2

General comments:

a) The authors proposed a new methodology to estimate abundance at age from trawl surveys and obtained different results from an existing method. However, the current manuscript does not show clearly the novelty and superiority of this new method compared to others. I would prefer structure in the introduction section, in which problems regarding existing methods are pointed out, if any, and then new methods are proposed as a solution. It would also be necessary to emphasize the generality of this methodology.

b) The two sub-models of this study analyze separately the same age composition data from trawl surveys and results from the two are integrated at the final stage. My major concern is that the age composition analysis does not consider spatial correlation in estimating age-structure in each year. As the authors point out in line 235, ignoring spatial correlation is likely to lead to an underestimation of variances. In particular, since sampling designs, including sampling locations, were changed in 2005, spatial effects should be incorporated by some kind of method in estimating age structures. In addition, if hake distribution is highly dependent on age, it would be proper to apply geostatistical models to different ages instead of age-aggregated data. Thus, I am suspicious about the validity of this new methodology, in which the two sub-models analyze the same data independently. I think that a statistically rigorous model to analyze age and spatial effects simultaneously is necessary to solve issues the authors raised.

<note> There is not yet a solution to the problem of CDA in space. P. has made some advances, but these are still based on traditional geostatistics, which creates other problems. </note>

c) The authors often refer to the small sample size of the BTS as a reason that more complicated models are difficult to apply. However, one of the major advantages of Bayesian methods is that they can deal with small data sets by incorporating proper prior information. I suspect that this model does not sufficiently take advantage of Bayesian approaches.

d) The length of this manuscript is compact and I prefer such shorter papers. However, this manuscript is not easy to understand as to what the authors really did in this analysis. For example, what is meant by “abundance” that is used frequently in the text, though I assumed “abundance in number”. Additionally, some texts in the results section (lines 163-168, 175-181, 192-193, 199-210, 216-218 etc) should be included in the methods section. I would prefer an explanation style regarding geostatistics used in Jardim and Ribeiro (2007).

Specific comments:

Lines 55-67. Most of the last paragraph of the introduction section is redundant and, if necessary, some texts should be moved to the material section.

Lines 80-95. I would like to clarify one point regarding sample design: was sampling conducted in all 4 depth ranges in each location? Or depth was selected randomly as well? If the latter is correct and hake abundance and/or age composition depend on trawl depth, how was this depth effect standardized through this analysis?

Lines 104-124. Subscripts of parameters should be explained more carefully, though most of them might be expected. For example, what is “n” and “m” in line 106 and “H” in line 113? A parameter P has subscripts “ij” in line105 and “ijh” in line 113. H has subscript i in line 121. These are very confusing.

Lines 112-114. My understanding is that this model does not consider abundance (sample size) differences among locations in analyzing age composition. If sample size is too small as a representative value in a location, such age composition data could impact final results wrongly. Did the authors conduct any pre-treatments like omitting data sets with small sample sizes?

Lines 114-116. The explanation is unclear.

Lines 122-124. When parametric bootstrap is conducted, back-transformed P will not be between 0 and 1 in some cases. Were any constraints placed to D in the bootstrap?

Lines 161-163. I did not quite follow the explanation. What did the authors actually conduct?

Line 167. What is the unit of “3”?

Lines 171-173. As the authors point out, ageing errors seem to be large for hake. It would have been interesting to know the impact of this source of uncertainty. Even if this factor is difficult to incorporate in the model, it would be nice to add information on how large ageing errors potentially are.

Line 175. Diagnostics regarding model fitting should be shown.

Line 192. What is the reason for selecting the exponential correlation function? If there is not strong evidence, it would be necessary to explore sensitivity to other function forms.

Lines 196-198. Probably the authors' judgment is proper, but the “90 degree rotation” does not really impact the final results?

Lines 204-205. This sentence is not clear.

Line 209. What is “flat prior”? What is the range of the prior distribution?

Line 213. The data did not update tau distribution considerably. In this case, it would be better to use different prior distribution functions as sensitivity tests.

Line 226. Replace “seem” with “seen”.

Lines 216-238. The authors consider that this geostatistical model showed considerably lower estimates than the design statistics due to “screen effect”. However, they do not provide a distinct reason why this new model is considered to be more proper. Though I realize this may not be easy to implement, methods such as cross validation and operating model approach can be used to show the geostatistical model works effectively and reasonably.

Line 236. The unit of “14 and 25” is “%”?

Lines 255-257. This sentence is unclear.

Lines 280-281. It would be better to elaborate on what was done in more detail, since this point may be critical.

Tables 1,2. What is the unit of abundance?

Figure 4. The order of sub-panels looks strange to me (1998-2006 and 1989-1997). Lower panels should be moved to the upper section to arrange panels in the order of time.

Figure 5. It might be nice to add design-based estimates to the figure.

Figure 7. The order of sub-panels looks strange to me (3, 4, 5, 0, 1, 2). If possible, it would be nice to include information on estimate uncertainties.

J20558 – Associate Editor advice

The two reviewers have identified major flaws in the MS and recommend rejection. However, both reviewers encourage resubmission of a new manuscript. Both reviewers also found that the MS was poorly prepared and hard to follow. I agree with the reviewers’ assessment and recommend that the MS be rejected in its current form, but submission of a new MS encouraged. The revised MS would be treated as a new submission, and a response letter is needed from the authors detailing how they address the reviewers' comments.

<note> Model based VS design based (REF01) Model developed to sort out possible bias.

</note>

