Model Summary
This model is a multimodal clinical decision-support model for predicting tumor immunotherapy response ("hot" vs. "cold" tumors), fine-tuned from LLaMA 33B.
- Upstream, a ResNet151 module performs immune-related feature engineering on tumor immunohistochemistry (IHC) stained slides and outputs a feature matrix.
- Upstream, a multi-omics fusion unit (MultiOmics VAE) extracts information from electronic health records (EHR) and outputs a feature vector.
- Downstream of the LLM, outputs are checked by a teaching model and then confirmed by human experts.
- The model is currently in closed testing at several hospitals to improve safety and privacy.
- The team is working to improve the model's stability and inference speed; the model will be open-sourced in spring 2024.
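As a rough illustration, the pipeline above might be wired together as follows. Everything here is a hypothetical stand-in (function names, the trivial feature math, and the prompt format are invented for illustration), not the team's actual implementation:

```python
def ihc_features(slide_pixels):
    """Stand-in for the ResNet151 IHC branch: emits a feature matrix.

    A real system would run a CNN forward pass; here we return a
    trivial 2x2 summary just to show the data flow."""
    n = len(slide_pixels)
    mean = sum(slide_pixels) / n
    return [[mean, max(slide_pixels)], [min(slide_pixels), float(n)]]

def ehr_features(record):
    """Stand-in for the MultiOmics VAE branch: emits a feature vector."""
    return [float(record.get("age", 0)), float(record.get("marker", 0))]

def build_prompt(matrix, vector, question):
    """Serialize both modalities into one prompt for the fine-tuned LLM."""
    flat = [x for row in matrix for x in row]
    return f"IHC={flat} EHR={vector} Q: {question}"

prompt = build_prompt(ihc_features([0.2, 0.8, 0.5]),
                      ehr_features({"age": 63, "marker": 1.7}),
                      "Predict immunotherapy response.")
print(prompt)
```

In this sketch the two upstream encoders are fused by simple serialization into the prompt; the card does not state whether the real system injects features at the prompt level or at the embedding level.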
Model Description
Model origin information
- Developed by: [Nantong University]
- Shared by: [Xuehai Wang]
- Model type: [EHR-enhanced LLM]
- Language(s) (NLP): [Chinese]
- Finetuned from model: [LLaMA 33B]
Model Sources
- Repository: [https://huggingface.co/bavest/fin-llama-33b-merged]
- Paper: [https://ai.facebook.com/blog/large-language-model-llama-meta-ai/]
Uses
- Purpose: The model is designed to assist healthcare professionals and researchers in analyzing medical data related to cancer. It provides insights and predictions based on input data and patterns learned from previous data, supporting decision-making and supplying additional information to medical experts.
- Users: Foreseeable users include oncologists, radiologists, medical researchers, and other healthcare professionals who specialize in cancer diagnosis and treatment. They can use the model to gain insights into the characteristics of cancerous tissues, potentially aiding treatment planning and assessment of therapy response.
- Data: The model relies on medical data, such as imaging scans (e.g., PET, CT, or MRI) and related patient information, to analyze whether a cancer is "hot" or "cold". This typically includes images, clinical reports, patient history, and potentially molecular or genetic information. Training data must be representative, diverse, and of high quality to achieve accurate and reliable results.
- Collaboration: The model should be used together with healthcare professionals and researchers who can interpret and validate its outputs. It is meant to augment human expertise, not replace it; responsibility for final decisions about patient care remains with medical professionals.
- Ethical considerations: Any model handling sensitive medical data must prioritize privacy, security, and ethics. Relevant data-protection regulations must be followed, appropriate patient consent obtained, and potential biases in training data or predictions addressed to ensure fairness and avoid unintended discrimination.
Out-of-Scope Use
- Misuse: The model should be used responsibly and within its intended scope. It should not be used for purposes it was not designed for, or in situations where its accuracy and reliability have not been properly evaluated; misuse could lead to incorrect diagnoses, ineffective treatments, or other harm to patients.
- Malicious use: Safeguards should be implemented to prevent malicious use. Access to the model and associated data should be restricted to authorized individuals or organizations, with security measures such as encryption and secure data storage to protect patient information and prevent unauthorized access or breaches.
- Limitations:
  - Lack of generalizability: Performance is influenced by the specific training datasets and may not transfer to populations or data types that differ significantly from them.
  - Uncertainty: Predictions and probabilities carry inherent uncertainty; the model should not be the sole basis for critical decisions without other clinical factors and expertise.
  - Data quality and biases: Biases in the training data, such as underrepresentation of certain demographics or data sources, can lead to biased predictions or outcomes.
  - Dynamic nature of medicine: Medical knowledge and practice evolve; performance may degrade or become outdated as new research and techniques emerge, so regular monitoring and updating are necessary.
- Use cases with limited applicability: If input data is incomplete, of low quality, or missing relevant information, outputs may be unreliable. The appropriate contexts and data requirements where the model can be effectively applied must be identified.
Bias, Risks, and Limitations
Technical limitations:
- Data quality and availability: Accuracy and reliability depend heavily on the quality, completeness, and representativeness of the training data; limited, biased, or poor-quality data degrades performance and generalizability.
- Interpretability and explainability: Complex models such as deep networks often work as black boxes. In the medical domain, interpretability is crucial for gaining trust from healthcare professionals who need to understand the factors behind a prediction.
- Limited knowledge transfer: Models trained on specific tasks and datasets may struggle to generalize to new contexts or domains; applying the model to a different population or medical condition requires careful evaluation and adaptation.

Sociotechnical limitations:
- Ethical considerations: Patient data must be protected, informed consent obtained, and potential biases in the data or model addressed to maintain trust and avoid discrimination.
- Human-AI collaboration: The model's role in medical decision-making is collaboration with, not replacement of, healthcare professionals; its limitations and uncertainties necessitate human oversight to interpret outputs and weigh broader clinical and ethical factors.
- Regulatory and legal considerations: Use in healthcare is subject to data-protection regulations, medical-device regulations, and other standards; compliance is required to avoid legal pitfalls and ensure patient safety.
- Adoption and acceptance: Resistance to change, unfamiliarity with AI technologies, and concerns about reliability and accountability can hinder clinical adoption.
- Resource constraints: Deployment and maintenance may require significant compute, technical infrastructure, and expertise, which can be scarce in resource-constrained healthcare settings.
Training Hyperparameters
- Training regime: <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
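No regime is recorded above. As a purely hypothetical illustration of how one of the listed options (bf16 mixed precision) might be documented, with every value an invented placeholder rather than the team's actual configuration:

```python
# Hypothetical training-regime record; all values are illustrative
# placeholders, not this model's real hyperparameters.
training_config = {
    "base_model": "LLaMA 33B",
    "precision": "bf16 mixed precision",
    "optimizer": "AdamW",
    "learning_rate": 2e-5,
    "epochs": 3,
}

# Rough parameter-memory arithmetic for the chosen precision:
# 33e9 parameters x 2 bytes (bf16) ≈ 66 GB for the weights alone,
# which is why a 33B fine-tune is typically sharded across GPUs.
weight_gb = 33e9 * 2 / 1e9
print(training_config["precision"], f"~{weight_gb:.0f} GB weights")
```

Filling this field in with the real regime would let readers reproduce the memory budget above.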
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [X86]
- Hours used: [300 hours]
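A back-of-envelope estimate in the spirit of the Machine Learning Impact calculator multiplies hardware power draw, hours, and grid carbon intensity. Only the GPU count (8× A800, from Compute Infrastructure below) and the 300 hours come from this card; the per-GPU power, PUE, and grid intensity below are assumptions that should be replaced with measured values:

```python
# Rough CO2eq estimate (Lacoste et al., 2019 methodology).
num_gpus = 8                 # from "A800*8" below
gpu_power_kw = 0.4           # assumed ~400 W per A800 under load
hours = 300                  # from "Hours used" above
pue = 1.1                    # assumed data-center power usage effectiveness
grid_kg_co2_per_kwh = 0.5    # assumed grid carbon intensity

energy_kwh = num_gpus * gpu_power_kw * hours * pue
emissions_kg = energy_kwh * grid_kg_co2_per_kwh
print(f"{energy_kwh:.0f} kWh, ~{emissions_kg:.0f} kg CO2eq")
```

With these assumed constants the run would consume roughly 1,056 kWh and emit on the order of 500 kg CO2eq; actual figures depend on the real power draw and the local grid.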
Compute Infrastructure
8× NVIDIA A800 GPUs
Hardware
On-premises server
Software
PyTorch 2.0, Ubuntu, CUDA