91.

Imaginary filtered hindsight experience replay for UAV tracking dynamic targets in large-scale unknown environments

《中国航空学报》 (Chinese Journal of Aeronautics), 2023, 36(5): 377-391

Unmanned Aerial Vehicles (UAVs) are widely used as advanced combat systems in military operations. In this paper, we formulate the Autonomous Navigation Control (ANC) problem of UAVs as a Markov Decision Process (MDP) and propose a novel Deep Reinforcement Learning (DRL) method that allows UAVs to perform dynamic target tracking tasks in large-scale unknown environments. To address the problem of limited training experience, the proposed Imaginary Filtered Hindsight Experience Replay (IFHER) generates successful episodes by reasonably imagining the target trajectory in a failed episode, thereby augmenting the experiences. The goal, episode, and quality filtering strategies ensure that only high-quality augmented experiences are stored, while the sampling filtering strategy of IFHER ensures that the stored augmented experiences are fully learned according to their high priorities. Trained in a complex environment constructed from the parameters of a real UAV, the proposed IFHER algorithm improves the convergence speed by 28.99% and the convergence result by 11.57% compared with the state-of-the-art Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. Testing experiments carried out in environments of different complexities demonstrate the strong robustness and generalization ability of the IFHER agent. Moreover, the flight trajectory of the IFHER agent shows the superiority of the learned policy and the practical application value of the algorithm.
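
For readers unfamiliar with the idea that IFHER builds on, the following is a minimal sketch of plain hindsight experience replay with the "final" relabeling strategy. It is not the paper's IFHER: the imagined target trajectory and the four filtering strategies are not reproduced, the goal is treated as static, and the names `Transition`, `sparse_reward`, and `relabel_with_final_goal` are hypothetical.

```python
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class Transition(NamedTuple):
    state: Point       # UAV position before the step
    action: Point      # commanded displacement
    next_state: Point  # UAV position after the step
    goal: Point        # target position the episode was conditioned on
    reward: float


def sparse_reward(achieved: Point, goal: Point, tol: float = 0.5) -> float:
    """Return 0.0 when the achieved position is within tol of the goal, else -1.0."""
    dx, dy = achieved[0] - goal[0], achieved[1] - goal[1]
    return 0.0 if (dx * dx + dy * dy) ** 0.5 <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Plain HER, 'final' strategy: treat the last reached position as the goal."""
    new_goal = episode[-1].next_state
    return [
        t._replace(goal=new_goal, reward=sparse_reward(t.next_state, new_goal))
        for t in episode
    ]


# A failed two-step episode: the UAV never gets near the goal at (10, 10).
failed = [
    Transition((0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (10.0, 10.0), -1.0),
    Transition((1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (10.0, 10.0), -1.0),
]

# The relabeled copy ends in success and can be stored next to the original.
augmented = relabel_with_final_goal(failed)
```

The relabeled episode is added to the replay buffer alongside the original one, so a failed rollout still yields at least one transition with a non-penalty reward.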

92.

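Design of a UAV perching maneuver control strategy based on deep reinforcement learning

黄赞, 何真, 仇靖雯. 《导航定位与授时》 (Navigation Positioning and Timing), 2022, (6): 25-32

Perching is a landing method for UAVs that needs no runway and improves a UAV's ability to adapt when carrying out missions in complex environments. For the highly nonlinear, multi-constraint perching maneuver, a control strategy design method based on imitation deep reinforcement learning is proposed. First, a longitudinal nonlinear dynamics model of the fixed-wing UAV perching maneuver is established, and a reinforcement learning environment for the maneuver is designed. Second, because the state and action spaces of the perching maneuver are large, the system is pre-trained by imitating expert experience in order to improve exploration efficiency. Then, starting from the weights obtained by imitation learning, proximal policy optimization is used to learn a neural network controller for the perching maneuver. Finally, simulations verify the effectiveness of this control strategy design method.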
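
As a reminder of what the proximal policy optimization step optimizes, here is a minimal sketch of the standard clipped surrogate objective. It assumes per-sample log-probabilities and advantage estimates are already available; the function name `ppo_clip_objective` and the default `clip_eps=0.2` are illustrative choices, not values taken from the paper.

```python
import math
from typing import Sequence


def ppo_clip_objective(
    new_logp: Sequence[float],
    old_logp: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,
) -> float:
    """Mean clipped surrogate objective of PPO (to be maximized)."""
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps] to limit the policy update.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In the imitation-then-PPO scheme described in the abstract, the policy network would start from the pre-trained weights, so the ratio starts near 1 relative to a policy that already resembles the expert.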

93.

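A method for autonomous obstacle avoidance and target tracking of UAVs

江未来, 徐国强, 王耀南. 《宇航学报》 (Journal of Astronautics), 2022, 43(6): 802-810

For the problem of autonomous obstacle avoidance and target tracking by UAVs, a multiple experience pools deep Q-network (MP DQN) algorithm is proposed on the basis of the deep Q-network (DQN) algorithm, improving both the success rate of obstacle avoidance and tracking and the convergence of the algorithm. Furthermore, the UAV is given environment perception capability, and a directional reward and penalty function is designed into the reward function, which improves the UAV's generalization to environments and the overall performance of the algorithm. Simulation results show that, compared with the DQN and double DQN (DDQN) algorithms, MP DQN converges faster, yields shorter tracking paths, and adapts better to the environment.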
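
The two baselines named in this abstract differ only in how the bootstrap target is formed. The sketch below shows that difference for a single transition; it does not show the multi-pool mechanism of MP DQN, which the abstract does not describe in detail. The function names and the list-based Q-value inputs are assumptions made for illustration.

```python
from typing import Sequence


def dqn_target(
    reward: float, gamma: float, next_q_target: Sequence[float], done: bool
) -> float:
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(
    reward: float,
    gamma: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    done: bool,
) -> float:
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]
```

Decoupling selection from evaluation reduces the overestimation that comes from taking a maximum over noisy Q-value estimates.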

94.

An efficient hyper-elastic model for the preforming simulation of Carbon-Kevlar hybrid woven reinforcement

《中国航空学报》 (Chinese Journal of Aeronautics), 2022, 35(12): 321-335

An efficient hyper-elastic model that reflects the primary mechanical behaviors of Carbon-Kevlar hybrid woven reinforcement was developed and implemented as VUMAT constitutive code for preforming simulation. The model parameters were determined from uniaxial and bias-extension tests. To validate the proposed hyper-elastic model, preforming experiments of the hybrid woven reinforcement were conducted over a hemisphere mold and a tetrahedron mold. The comparison between simulations and experiments shows that the model not only captures the shear angle distribution and the deformed geometry accurately, but also accurately predicts the force–displacement curve and potential fiber tensile failure during the preforming process. This result indicates that the proposed model can be used to predict the preforming behavior of Carbon-Kevlar hybrid woven reinforcement and to simulate the manufacturing of parts with complicated geometry.

95.

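Trajectory prediction method considering the uncertain behavior patterns of moving targets

颜鹏, 郭继峰, 白成超. 《宇航学报》 (Journal of Astronautics), 2022, 43(8): 1040-1051

Existing methods have difficulty predicting trajectories that match the uncertain behavior patterns of flying moving targets. To address this, a trajectory prediction method for flying moving targets based on inverse reinforcement learning is proposed, which predicts the target's trajectory by learning the target's behavior preferences and simulating its behavior decision process. First, a behavior decision model and a behavior preference model of the target are built on deep neural networks; the model parameters are then learned alternately by maximum entropy inverse reinforcement learning. To learn the target's uncertain behavior characteristics effectively, supervised learning is used to obtain a probability distribution model of the target's demonstration trajectories, which guides the training of the behavior preference model and initializes the behavior decision model; the behavior preference model is also pre-trained to improve its training quality. Simulation results show that the proposed method can simulate the target's behavior fairly accurately through the learned behavior decision model, and that the similarity between the predicted and true target trajectory distributions, measured by Kullback-Leibler (KL) divergence, reaches 0.24.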
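
The reported figure is a KL divergence, where 0 means the two distributions are identical and larger values mean a worse match. A minimal sketch of the discrete form follows; it assumes both distributions are given as probabilities over the same set of bins (for example, grid cells visited by trajectories), which is an illustrative choice and not necessarily how the paper discretizes trajectories.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """D_KL(p || q) = sum_i p_i * ln(p_i / q_i) for discrete distributions, in nats."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            # Floor q_i at eps so that an empty bin in q does not divide by zero.
            total += p_i * math.log(p_i / max(q_i, eps))
    return total
```

KL divergence is not symmetric, so which distribution plays the role of `p` (true or predicted) changes the value.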

96.

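Backflip motion generation method for quadruped robots based on deep reinforcement learning

李岸荞, 王志成, 古勇, 吴俊, 朱秋国. 《导航定位与授时》 (Navigation Positioning and Timing), 2021, 8(6): 35-42

Generating dexterous motor skills for quadruped robots has long attracted wide attention from robotics researchers; aerial flipping motions both demonstrate the agility of quadruped locomotion and have some practical value. In recent years, deep reinforcement learning has offered a new way to realize dexterous quadruped motions, and the closed-loop neural network controllers it produces are highly adaptable and stable. This paper applies a deep reinforcement learning method based on imitating expert experience to the Jueying Lite (绝影Lite) robot, achieves learning of a quadruped backflip in a simulation environment, and further shows that the designed closed-loop neural network backflip controller is more adaptable than a traditional open-loop position controller.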

97.

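Joint optimization strategy for multi-UAV cooperative computing and resource allocation in edge networks

郭永安, 王宇翱, 周沂, 房晶晶, 任保全. 《南京航空航天大学学报》 (Journal of Nanjing University of Aeronautics & Astronautics), 2023, 55(5): 757-767

In edge network environments, computing load is unevenly distributed among multiple UAVs, which causes offloaded tasks to fail. To address this, an intelligent task offloading scheme based on cooperation among multiple UAVs is proposed. By jointly considering multi-UAV task allocation, computing resource allocation, and UAV flight trajectories, and by introducing a fairness index, a problem of maximizing fair UAV load and minimizing energy consumption is formulated. Based on a multi-agent deep reinforcement learning framework, a distributed algorithm that integrates trajectory planning and task offloading is proposed. Simulation results show that the proposed multi-UAV cooperation scheme significantly improves the task completion rate and load fairness, and that it scales effectively to scenarios with large numbers of user devices.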
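
The abstract does not say which fairness index is used. Jain's fairness index is one common choice for load balancing, and the sketch below assumes it purely for illustration; the function name `jain_fairness` and the per-UAV load list are hypothetical.

```python
from typing import Sequence


def jain_fairness(loads: Sequence[float]) -> float:
    """Jain's index: (sum x)^2 / (n * sum x^2), ranging from 1/n to 1."""
    if not loads:
        raise ValueError("loads must not be empty")
    sum_sq = sum(x * x for x in loads)
    if sum_sq == 0.0:
        # All loads are zero, which is treated here as perfectly even.
        return 1.0
    total = sum(loads)
    return (total * total) / (len(loads) * sum_sq)
```

An index of 1 means every UAV carries the same load, while 1/n means a single UAV carries all of it.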

98.

Locally generalised multi-agent reinforcement learning for demand and capacity balancing with customised neural networks

《中国航空学报》 (Chinese Journal of Aeronautics), 2023, 36(4): 338-353

Reinforcement Learning (RL) techniques are being studied to solve Demand and Capacity Balancing (DCB) problems and to fully exploit their computational performance. A locally generalised Multi-Agent Reinforcement Learning (MARL) method for real-world DCB problems is proposed. The proposed method can deploy trained agents directly to unseen scenarios in a specific Air Traffic Flow Management (ATFM) region to quickly obtain a satisfactory solution. In this method, the agents of all flights in a scenario form a multi-agent decision-making system based on partial observation. The trained agent with the customised neural network can be deployed directly on the corresponding flight, allowing the agents to solve the DCB problem jointly. A cooperation coefficient is introduced in the reward function to adjust each agent's cooperation preference in the multi-agent system, thereby controlling how flight delay time is distributed. A multi-iteration mechanism is designed for the DCB decision-making framework to deal with problems arising from non-stationarity in MARL and to ensure that all hotspots are eliminated. Experiments based on large-scale, high-complexity real-world scenarios are conducted to verify the effectiveness and efficiency of the method. From a statistical point of view, it is shown that the proposed method generalises within the scope of the flights and sectors of interest, and that its optimisation performance exceeds that of the standard computer-assisted slot allocation and state-of-the-art RL-based DCB methods. The sensitivity analysis preliminarily reveals the effect of the cooperation coefficient on delay time allocation.

99.

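Deep reinforcement learning guidance law for intercepting maneuvering targets in the atmosphere

邱潇颀, 高长生, 荆武兴. 《宇航学报》 (Journal of Astronautics), 2022, 43(5): 685-695

For the interception of high-speed maneuvering targets in the atmosphere, a deep reinforcement learning guidance law based on the twin delayed deep deterministic policy gradient (TD3) algorithm is proposed. It maps engagement state information directly to the interceptor's commanded acceleration and is an end-to-end, model-free guidance strategy. First, the engagement kinematics of the attacker and defender are described as a Markov decision process suitable for deep reinforcement learning. Then a complete deep reinforcement learning guidance algorithm is built by suitably designing the engagement scenarios, action space, state space, and network structure needed for training, and by introducing reward shaping and random state initialization. Simulation results show that, compared with proportional navigation and augmented proportional navigation, the deep reinforcement learning guidance strategy achieves a smaller miss distance while relaxing the accuracy required of midcourse guidance; it is robust and generalizes well, and its computational load is small enough to run on an onboard missile computer.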
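
For context, the two classical baselines can be written in a few lines. The sketch below uses their common textbook planar forms; the navigation gain, the variable names, and the example numbers are assumptions, since the abstract does not state the settings used in the comparison.

```python
def pn_acceleration(nav_gain: float, closing_speed: float, los_rate: float) -> float:
    """Proportional navigation: a_c = N * V_c * (line-of-sight rate)."""
    return nav_gain * closing_speed * los_rate


def apn_acceleration(
    nav_gain: float,
    closing_speed: float,
    los_rate: float,
    target_accel_normal: float,
) -> float:
    """Augmented PN: adds the feed-forward term (N / 2) * a_T for target maneuver."""
    return (
        pn_acceleration(nav_gain, closing_speed, los_rate)
        + 0.5 * nav_gain * target_accel_normal
    )


# Hypothetical numbers: N = 3, closing speed 1000 m/s, LOS rate 0.01 rad/s,
# target acceleration normal to the line of sight 20 m/s^2.
a_pn = pn_acceleration(3.0, 1000.0, 0.01)
a_apn = apn_acceleration(3.0, 1000.0, 0.01, 20.0)
```

Augmented PN needs an estimate of the target's acceleration, whereas the learned guidance law in the abstract maps engagement state directly to the commanded acceleration.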

100.

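Spacecraft attitude controller design based on reinforcement learning

张瑞卿, 钟睿, 徐毅. 《上海航天》 (Aerospace Shanghai), 2023, 40(1): 80-85

When a spacecraft performs certain tasks in orbit, its mass parameters change in unknown ways, and traditional control methods perform poorly in this situation. This paper proposes a reinforcement-learning-based design method for spacecraft attitude controllers. Training the attitude controller requires no dynamics model of the spacecraft and does not depend on its mass parameters. When the mass parameters undergo large unknown changes, the trained controller still maintains good control performance. Simulation tests show that controllers trained with the reinforcement learning method are indeed robust. In addition, because the design of the reward function clearly affects the training of the attitude controller, different reward function designs are studied.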
